ABSTRACT

Theorems are given concerning the order (i.e., rate) of convergence of a successive
interpolation process for finding simple zeros of a function or its derivatives,
using only function evaluations. Special cases include the successive linear
interpolation process for finding zeros, and a parabolic interpolation process for finding
turning points. Results on interpolation and finite differences include weakening the
hypotheses of a theorem of Ralston on the derivative of the error in Lagrangian
interpolation.

The theoretical results are applied to give algorithms for finding zeros or local
minima of functions of one variable, in the presence of rounding errors. The algorithms
are guaranteed to converge nearly as fast as would bisection or Fibonacci search, and
in most practical cases convergence is superlinear, and much faster than for bisection
or Fibonacci search.

The problem of finding a global minimum of a function f, of one variable, is
investigated. We give a nearly optimal algorithm which is applicable if an upper bound
on f" is known. A generalization, useful in practice if n ≤ 3, is given for
functions  of  n  variables.  The  effect  of  rounding  errors  in  these  algorithms  can  be 
accounted  for. 

Finally,  we  present  a  modification  of  Powell's  algorithm  for  finding  a  local 
minimum of a function of several variables without calculating derivatives. The
modification ensures that the search directions cannot become linearly dependent, and
numerical  examples  suggest  that  the  algorithm  compares  favorably  with  other  methods 
which  do  not  require  derivatives. 

A bibliography on unconstrained minimization is given, and ALGOL implementations
of all the above algorithms are included.
Preface


The  problem  of  finding  numerical  approximations  to  the  zeros  and 
extrema  of  functions,  using  hand  computation,  has  a  long  history.  In 
the  last  few  years,  considerable  progress  has  been  made  in  the  development 
of  algorithms  suitable  for  use  on  a  digital  computer.  The  aim  of  this 
work is to suggest improvements to some of these algorithms, extend the
mathematical theory behind them, and describe some new algorithms for
approximating local and global minima. The unifying thread is that all
the  algorithms  considered  depend  entirely  on  sequential  function 
evaluations:  no  evaluations  of  derivatives  are  required.  Such  algorithms 
are  very  useful  if  derivatives  are  difficult  to  evaluate,  and  this  is 
often  true  in  practical  problems. 
for their advice and encouragement during my stay at Stanford, and for
members of my reading committee, Professors J. G. Herriot, F. W. Dorr
and  C.  B.  Moler,  for  their  careful  reading  of  various  drafts,  and  for 
many  helpful  suggestions. 

Several  people  have  contributed  to  this  work.  I  would  particularly 
like  to  thank  Dr.  T.  J.  Rivlin  for  suggesting  how  to  find  bounds  on 
polynomials  (Chapter  6),  and  Dr.  J.  H.  Wilkinson  for  introducing  me  to 
Dekker’s  algorithm  (Chapter  4).  Also,  thanks  to  Professor  F.  Dorr  and 

Introduction and Summary

1. Introduction

Consider  the  problem  of  finding  an  approximate  zero  or  minimum  of 
a  function  of  one  real  variable,  using  limited-precision  arithmetic  on  a 
sequential  digital  computer.  The  function  f  may  not  be  differentiable, 
or the derivative f' may be difficult to compute, so a method which
uses  only  computed  values  of  f  is  desirable.  Since  an  evaluation  of 
f  may  be  very  expensive  in  terms  of  computer  time,  a  good  method  should 
guarantee  to  find  a  correct  solution,  to  within  some  prescribed  tolerance, 
using only a small number of function evaluations. Hence, we study
algorithms  which  depend  on  evaluating  f  at  a  small  number  of  points, 
and  for  which  certain  desirable  properties  are  guaranteed,  even  in  the 
presence  of  rounding  errors. 

Slow,  safe  algorithms  are  seldom  preferred  in  practice  to  fast 
algorithms  which  may  occasionally  fail.  Thus,  we  want  algorithms  which 
are  guaranteed  to  succeed  in  a  reasonable  time  even  for  the  most  "difficult" 
functions,  yet  are  as  fast  as  commonly  used  algorithms  for  "easy" 
functions.  For  example,  bisection  is  a  safe  method  for  finding  a  zero 
of  a  function  which  changes  sign  in  a  given  interval,  but  from  our  point 
of  view  it  is  not  an  acceptable  method,  because  it  is  just  as  slow  for 
any  function,  no  matter  how  well  behaved,  as  it  is  in  the  worst  possible 
case  (ignoring  the  possibility  that  an  exact  zero  may  occasionally  be 
found  by  chance).  As  a  contrasting  example,  consider  the  method  of 
successive  linear  interpolation,  which  converges  superlinearly  to  a 
simple zero of a function, provided that the initial approximation
is good and rounding errors are unimportant. This method is not
acceptable either, for, in practice, we may have no way of knowing in
advance if the zero is simple, if the initial approximation is sufficiently
good to ensure convergence, or what the effect of rounding errors will be.

In  Chapter  4  we  describe  an  algorithm  which,  by  combining  some  of 
the  desirable  features  of  bisection  and  successive  linear  interpolation, 
does  come  close  to  satisfying  our  requirements:  it  is  guaranteed  to 
converge  (i.e.,  halt)  after  a  reasonably  small  number  of  function 
evaluations,  and  the  rate  of  convergence  for  well-behaved  functions 
is  so  fast  that  a  less  reliable  algorithm  is  unlikely  to  be  preferred 
on  grounds  of  speed. 
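
The idea of safeguarding linear interpolation with bisection can be illustrated by a minimal Python sketch. It is not the algorithm of Chapter 4: the name hybrid_zero, the tolerance handling, and the rule "bisect on the next step whenever an interpolation step fails to halve the bracketing interval" are assumptions made purely for illustration. The sketch is guaranteed to halt, but it makes no attempt to reproduce the convergence rate of the Chapter 4 algorithm, and it ignores the rounding-error analysis given there.

```python
def hybrid_zero(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of f in [a, b], given a sign change on the interval.

    Illustrative sketch only: linear interpolation safeguarded by bisection.
    """
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if (fa > 0.0) == (fb > 0.0):
        raise ValueError("f must change sign on [a, b]")
    force_bisect = False
    for _ in range(max_iter):
        width = abs(b - a)
        if width <= tol:
            break
        x = a + 0.5 * (b - a)          # bisection point (the safe default)
        if not force_bisect:
            # Linear interpolation point; fb - fa != 0 because the signs differ.
            s = b - fb * (b - a) / (fb - fa)
            if min(a, b) < s < max(a, b):
                x = s
        fx = f(x)
        if fx == 0.0:
            return x
        # Keep the sub-interval on which f still changes sign.
        if (fx > 0.0) == (fa > 0.0):
            a, fa = x, fx
        else:
            b, fb = x, fx
        # If this step did not at least halve the bracket, bisect next time.
        force_bisect = abs(b - a) > 0.5 * width
    return a if abs(fa) <= abs(fb) else b


root = hybrid_zero(lambda x: x**3 - 2.0 * x - 5.0, 2.0, 3.0)
```

Because the bracket is at least halved every two steps, the number of function evaluations in this sketch is at most about twice that of pure bisection.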

An  analogous  algorithm,  which  finds  a  local  minimum  of  a  function 
of  one  variable  by  a  combination  of  golden  section  search  and  successive 
parabolic interpolation, is described in Chapter 5. This algorithm
fails to completely satisfy one of our requirements: in certain
applications where repeated one-dimensional minimizations are required,
and where accuracy is not very important, a faster (though less reliable)
method is preferable. One such application, finding local minima of
functions of several variables without calculating derivatives, is
discussed in Chapter 7. Note that, wherever we consider minima, we
could  equally  well  consider  maxima. 
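
For readers unfamiliar with golden section search, the safe component of the Chapter 5 algorithm, the following Python sketch shows the basic interval reduction. It is a textbook version written for illustration, not the procedure of Chapter 5: the name golden_section_min and the stopping rule on the interval width are assumptions, and no parabolic interpolation steps are attempted.

```python
import math


def golden_section_min(f, a, b, tol=1e-6):
    """Approximate a local minimum of f on [a, b] by golden section search.

    Illustrative sketch: each step reuses one interior function value, so
    only one new evaluation of f is needed per reduction of the interval.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # 0.618..., reciprocal golden ratio
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum is bracketed by [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum is bracketed by [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


x_min = golden_section_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

The interval shrinks by the fixed factor 0.618... per evaluation regardless of how smooth f is, which is why such a method is safe but, like bisection, never faster on "easy" functions.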

Most  algorithms  for  minimizing  a  nonlinear  function  of  one  or  more 
variables  find,  at  best,  a  local  minimum.  For  a  function  with  several 
local  minima,  there  is  no  guarantee  that  the  local  minimum  found  is  the 
global  (i.e.,  true  or  lowest)  minimum.  Since  it  is  the  global  minimum 
which is of interest in most applications, this is a serious practical
disadvantage  of  most  minimization  algorithms,  and  our  algorithm  given 
in  Chapter  5  is  no  exception.  The  usual  remedy  is  to  try  several 
different starting points and, perhaps, vary some of the parameters of
the  minimization  procedure,  in  the  hope  that  the  lowest  local  minimum 
found  is  the  global  minimum.  This  approach  is  inefficient,  as  the  same 
local  minimum  may  be  found  several  times,  and  it  is  also  unreliable,  for, 
no  matter  how  many  starting  points  are  tried,  it  is  impossible  to  be 
quite sure that the global minimum has been found.

In  Chapter  6  we  discuss  the  problem  of  finding  the  global  minimum 
to  within  a  prescribed  tolerance.  It  is  possible  to  give  an  algorithm 
for  solving  this  problem,  provided  that  a  little  a  priori  information 
about  the  function  to  be  minimized  is  known.  We  describe  an  efficient 
algorithm,  applicable  if  an  upper  bound  on  f"  is  known,  and  we  show 
how  this  algorithm  can  be  used  recursively  to  find  the  global  minimum 
of  a  function  of  several  variables.  Unfortunately,  because  the  amount 
of  computation  involved  increases  exponentially  with  the  number  of 
variables, this is practically useful only for functions of fewer than
four  variables.  For  functions  of  more  variables,  we  still  have  to 
resort  to  the  unreliable  "trial  and  error"  method,  unless  special 
information  about  the  function  to  be  minimized  is  available. 

Thus,  we  are  led  to  consider  practical  methods  for  finding  local 
(unconstrained) minima of functions of several variables. As before, we
consider  methods  which  depend  on  evaluating  the  function  at  a  small 
number  of  points.  Unfortunately,  without  imposing  very  strict  conditions 
on  the  functions  to  be  minimized,  it  is  not  possible  to  guarantee  that 
an  n-dimensional  minimization  algorithm  produces  results  which  are  correct 
to within some prescribed tolerance, or that the effect of rounding errors
has been taken into account completely. We have to be satisfied with
algorithms which nearly always give correct results for the functions
likely  to  arise  in  practical  applications. 

As  suggested  by  the  length  of  our  bibliography,  there  has  recently 
been  considerable  interest  in  the  unconstrained  minimization  problem. 
Thus,  we  can  hardly  expect  to  find  a  good  method  which  is  completely 
unrelated  to  the  known  ones.  In  Chapter  7  we  take  one  of  the  better 
methods  which  does  not  use  derivatives,  that  of  Powell  (1964),  and  modify 
it  to  try  to  overcome  some  of  the  difficulties  observed  in 
the  literature.  Numerical  tests  suggest  that  our  proposed  method  is 
faster than Powell's original method, and just as reliable. It also
compares  quite  well  with  a  different  method  proposed  by  Stewart  (1967), 
at least for functions of fewer than ten variables. (We have no numerical
results for non-quadratic functions of more than ten variables.)

ALGOL  implementations  of  all  the  above  algorithms  are  given.  Most 
testing was done with ALGOL W (Wirth and Hoare (1966)) on IBM 360/67 and
360/91  computers.  As  ALGOL  W  is  not  widely  used,  we  give  ALGOL  60 
procedures  (Naur  (1963)),  except  for  the  n-dimensional  minimization 
algorithm.  FORTRAN  subroutines  for  the  one-dimensional  zero-finding 
and  local  minimization  algorithms  are  also  available. 

To  recapitulate,  we  describe  algorithms,  and  give  ALGOL  procedures, 
for  solving  the  following  problems  efficiently,  using  only  function  (not 
derivative)  evaluations: 


1. Finding a zero of a function of one variable if an interval in which
the function changes sign is given;

2. Finding a local minimum of a function of one variable, defined on a
given interval;

3. Finding, to within a prescribed tolerance, the global minimum of
a function of one or more variables, given upper bounds on the
second derivatives;

4. Finding a local minimum of a function of several variables.

For  the  first  three  algorithms,  rigorous  bounds  on  the  error  and  the 
number  of  function  evaluations  required  are  established,  taking  the 
effect  of  rounding  errors  into  account.  Some  results  concerning  the 
order of convergence of the first two algorithms, and preliminary
results  on  interpolation  and  divided  differences,  are  also  of  interest. 


2.  Summary 

In  this  section  we  summarize  the  main  results  of  the  following 
chapters.  A  more  detailed  discussion  is  given  at  the  appropriate 
places  in  each  chapter.  This  summary  is  intended  to  serve  as  a  guide 
to  the  reader  who  is  interested  in  some  of  our  results,  but  not  in 
others.  To  assist  such  a  reader,  an  attempt  has  been  made  to  keep  each 
chapter  as  self-contained  as  possible. 

Chapter  2 

In  Chapter  2  we  collect  some  results  on  Taylor  series,  Lagrangian 
interpolation, and divided differences. Most of these results are needed
in  Chapter  3,  and  the  casual  reader  might  prefer  to  skip  Chapter  2  and 
refer  back  to  it  when  necessary.  Some  of  the  results  are  similar  to 
classical ones, but instead of assuming that f has n+1 continuous
derivatives, we only assume that $f^{(n)}$ is Lipschitz continuous, and
the term $f^{(n+1)}(\xi)$ in the classical results is replaced by a number
bounded in absolute value by a Lipschitz constant. For example,
Lemmas 2.3.1, 2.3.2, 2.4.1, and 2.5.1 are of this nature. Since a
Lipschitz  continuous  function  is  differentiable  almost  everywhere, 
these  results  are  not  surprising,  although  they  have  not  been  found  in 
the  literature,  except  where  references  are  given.  (Sometimes  Lipschitz 
conditions  are  imposed  on  the  derivatives  of  functions  of  several 
variables:  see,  for  example,  Armijo  (1966)  and  McCormick  (1969).)  The 
proofs  are  mostly  similar  to  those  for  the  classical  results. 

Theorem  2.6.1  is  a  slight  generalization  of  some  results  of 
Ralston  (1963,  1965)  on  differentiating  the  error  in  Lagrangian 
interpolation.  It  is  included  both  for  its  independent  interest,  and 
because  it  may  be  used  to  prove  a  slightly  weaker  form  of  Lemma  3.6.1 
for  the  important  case  q  =  2  .  (A  similar  proof  is  sketched  in 
Kowalik  and  Osborne  (1968).) 

An  interesting  result  of  Chapter  2  is  Theorem  2.6.2,  which  gives 
an  expression  for  the  derivative  of  the  error  in  Lagrangian  interpolation 
at  the  points  of  interpolation.  A  well-known  weaker  result  is  that  the 
conclusion  of  Theorem  2.6.2  holds  if  f  has  n+1  continuous  derivatives, 
but  Theorem  2.6.2  shows  that  it  is  sufficient  for  f  to  have  n 
continuous  derivatives. 

Theorem 2.5.1, which gives an expansion of divided differences, may
be regarded as a generalization of Taylor's theorem. It is used several
times in Chapter 3: for example, see Theorem 3.4.1 and Lemma 3.6.1.
Theorem 2.5.1 is useful for the analysis of interpolation processes
whenever the coefficients of the interpolation polynomials can conveniently
be expressed in terms of divided differences.

Chapter  3 

In Chapter 3 we prove some theorems which provide a theoretical
foundation for the algorithms described in Chapters 4 and 5. In
particular, we show when the algorithms will converge superlinearly,
and  what  the  order  (i.e.,  rate)  of  convergence  will  be.  Of  course,  for 
these  results  the  effect  of  rounding  errors  is  ignored.  The  reader 
whose main interest is the practical applications of our results might
omit Chapter 3, except for the numerical examples (Section 3.9) and the
summary  (Section  3.10). 

So  that  results  concerning  successive  linear  interpolation  for 
finding  zeros  (used  in  Chapter  4),  and  successive  parabolic  interpolation 
for  finding  turning  points  (used  in  Chapter  5)  ,  can  be  given  together, 
we consider a more general process for finding a zero of $f^{(q-1)}$, for
any fixed q ≥ 1. Successive linear interpolation and successive
parabolic interpolation are just the special cases q = 1 and q = 2.
Another  case  which  is  of  some  practical  interest  is  q  =  3  ,  for  finding 
inflexion  points.  As  the  proofs  for  general  q  are  essentially  no  more 
difficult  than  for  q  =  2  ,  most  of  our  results  are  for  general  q  . 
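
To make the case q = 2 concrete, the following Python sketch computes a single step of successive parabolic interpolation: the abscissa of the turning point of the parabola through three points. The function name parabolic_step and the handling of the degenerate (collinear) case are assumptions for illustration; the sketch ignores rounding errors and is not the safeguarded procedure of Chapter 5.

```python
def parabolic_step(a, fa, b, fb, c, fc):
    """Turning point of the parabola through (a, fa), (b, fb), (c, fc).

    Illustrative sketch of one step of successive parabolic interpolation.
    """
    p = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    q = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if q == 0.0:
        # The three points lie on a straight line: no turning point exists.
        raise ZeroDivisionError("interpolating parabola is degenerate")
    return b - 0.5 * p / q


def f(x):
    return (x - 1.0) ** 2


# For an exact quadratic the turning point is recovered in a single step.
x_new = parabolic_step(0.0, f(0.0), 2.0, f(2.0), 3.0, f(3.0))
```

In the successive process, the new point replaces the oldest of the three and the step is repeated.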

For  the  applications  in  Chapters  4  and  5,  the  most  important 
results are Theorem 3.4.1, which gives conditions under which convergence
is superlinear, and Theorem 3.5.1, which shows when the order is at least
1.618... (for q = 1) or 1.324... (for q = 2). These numbers are
well-known, but our assumptions about the differentiability of f are
weaker than those of previous authors, e.g., Ostrowski (1966) and
Jarratt (1967, 1968).
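
The two constants quoted above can be checked numerically. The sketch below relies on the standard characterisation, not stated in this summary, that 1.618... is the positive root of x^2 = x + 1 and 1.324... is the positive root of x^3 = x + 1; the helper name positive_root is hypothetical.

```python
import math


def positive_root(g, lo, hi, iters=200):
    """Root of g in [lo, hi] by bisection, assuming g(lo) < 0 < g(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Order for successive linear interpolation (q = 1): root of x^2 - x - 1.
order_linear = positive_root(lambda x: x**2 - x - 1.0, 1.0, 2.0)

# Order for successive parabolic interpolation (q = 2): root of x^3 - x - 1.
order_parabolic = positive_root(lambda x: x**3 - x - 1.0, 1.0, 2.0)
```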

From a mathematical point of view, the most interesting result
of Chapter 3 is Theorem 3.7.1. The result for q = 1 is given in
Ostrowski (1966), except for our slightly weaker assumption about the
smoothness of f. For q = 2, our result that convergence to ζ with
order at least 1.378... is possible, even if $f^{(3)}(\zeta) \ne 0$, appears to
be new. Jarratt (1967) and Kowalik and Osborne (1968) assume that

$$\lim_{n \to \infty} \, [\,\ldots\,] = 0 , \tag{2.1}$$

and then, from Lemma 3.6.1, the order of convergence is 1.324... .
However, even for such a simple function as

$$f(x) = 2x^3 + x^2 \tag{2.2}$$

there are starting points $x_0$, $x_1$ and $x_2$ such that (2.1) fails to
hold, and then the order may be at least 1.378... . We should point
out that this exceptional case is unlikely to occur: an interesting
conjecture is that the set of starting points for which it occurs has
measure zero.

The practical conclusion to be drawn from Theorem 3.7.1 is that,
if convergence is to be accelerated, then the result of Lemma 3.6.1
should be used. In Section 3.8 we give one of the many ways in which
this may be done. Finally, some numerical examples illustrating both the
accelerated and unaccelerated processes are given in Section 3.9.

Chapter 4

In  Chapter  4  we  describe  an  algorithm  for  finding  a  zero  of  a 
function  which  changes  sign  in  a  given  interval.  The  algorithm  is 
based  on  a  combination  of  successive  linear  interpolation  and  bisection, 
in much the same way as "Dekker's algorithm" (van Wijngaarden, Zonneveld
and Dijkstra (1963), Wilkinson (1967), Peters and Wilkinson (1969),
Dekker (1969)). Our algorithm never converges much more slowly than bisection,
whereas Dekker's algorithm may converge extremely slowly in certain cases.
(Examples  are  given  in  Section  4.2.) 

It  is  well-known  that  bisection  is  the  optimal  algorithm,  in  a 
minimax  sense,  for  finding  zeros  of  functions  which  change  sign  in  an 
interval. (We only consider sequential algorithms: see Robbins (1952),
Wilde (1964) and Section 4.5.) The motivation for both our algorithm and
Dekker's is that bisection is not optimal if the class of allowable
functions is suitably restricted. For example, it is not optimal for
convex functions (Bellman and Dreyfus (1962), Gross and Johnson (1959)),
or  for  functions  with  simple  zeros. 

Both our algorithm and Dekker's exhibit superlinear convergence to
a simple zero of a $C^1$ function, for eventually only linear interpolations
are performed, and the theorems of Chapter 3 are applicable. Thus,
convergence is usually much faster than for bisection. Our algorithm
incorporates inverse quadratic interpolation as well as linear interpolation,
so it is often slightly faster than Dekker's algorithm on well-behaved
functions  (see  Section  4.4). 


Chapter  5 

An  algorithm  for  finding  a  local  minimum  of  a  function  of  one 
variable is described in Chapter 5. The algorithm combines golden
section search (Bellman (1957), Kiefer (1953), Wilde (1964), Witzgall
(1969))  and  successive  parabolic  interpolation  (the  case  q  =  2  of  the 
process  analysed  in  Chapter  3),  in  the  same  way  as  bisection  and  successive 
linear  interpolation  are  combined  in  the  zero-finding  algorithm  of 
Chapter  4.  Convergence  in  a  reasonable  number  of  function  evaluations 
is guaranteed (see Section 5.5), and, for a $C^2$ function with positive
curvature  at  the  minimum,  the  results  of  Chapter  3  show  that  convergence 
is  superlinear,  if  we  ignore  rounding  errors  and  suppose  that  the  minimum 
is  at  an  interior  point  of  the  interval.  Other  algorithms  given  in  the 
literature  either  fail  to  have  these  two  desirable  properties,  or,  when 
convergence  is  strictly  superlinear,  the  order  of  convergence  is  less 
than  for  our  algorithm  (see  Sections  5.4  and  5.5). 

In Sections 5.2 and 5.3 we consider the effect of rounding errors.
Section 5.2 contains an analysis of the limitations, imposed by rounding
errors, on the attainable accuracy of any algorithm which is based
entirely on function evaluations, and this section should be studied
by the reader who intends to use the ALGOL procedure given in Section 5.8.

If  f  is  unimodal,  then  our  algorithm  will  find  the  unique  minimum, 
provided  there  are  no  rounding  errors.  To  study  the  effect  of  rounding 
errors, we define "δ-unimodal" functions. A unimodal function is δ-unimodal
for all δ > 0, but a computed approximation to a unimodal function cannot
be unimodal: it will be δ-unimodal for some positive δ, depending
on the function and on the precision of computation. (δ → 0 as the
precision increases indefinitely.) We prove some theorems about δ-unimodal
functions, and give a bound for the error in the approximate minimum found
by our algorithm when applied to a δ-unimodal function. In this way we
can justify the use of our algorithm in the presence of rounding errors,
and account for their effect. Our motivation is rather similar to that
of Richman (1968) in developing the ε-calculus, but we are not concerned
with properties that hold as ε → 0. The reader who is not very
interested in the effect of rounding errors might prefer to skip
Section 5.3.

Chapter  6 

In  Chapter  6  we  consider  the  problem  of  finding  an  approximation 
to  the  global  minimum  of  a  function  f  ,  defined  on  a  finite  interval, 
if  some  a  priori  information  about  f  is  known.  This  interesting  problem 
does  not  seem  to  have  received  much  attention,  although  there  have  been 
some empirical investigations, e.g., see Magee (1960). In Section 6.1,
we show why some a priori information is necessary, and discuss some of
the  possibilities.  In  the  remainder  of  the  chapter  we  restrict  our 
attention  to  the  case  where  an  upper  bound  on  f"  is  known. 

An  algorithm  for  global  minimization  of  a  function  of  one  variable, 
applicable  when  such  an  upper  bound  on  the  second  derivative  is  known,  is 
described  in  Section  6.3.  The  basic  idea  of  this  algorithm  is  used  by 
Rivlin  (1970)  to  find  bounds  on  a  polynomial  in  a  given  interval.  We 
pay  particular  attention  to  the  problem  of  giving  guaranteed  bounds  in 
the  presence  of  rounding  errors,  and  the  casual  reader  may  find  the 
details in the last half of Section 6.3 rather indigestible.
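
To indicate why an upper bound M on f" is useful, note that if f" ≤ M on [a,b] then f lies above the parabola with second derivative M which interpolates f at a and b, so the minimum of that parabola is a lower bound for f on [a,b]. The Python sketch below uses this bound in a simple subdivision scheme. It is an illustration only, not the algorithm of Section 6.3: the names parabola_lower_bound and global_min_sketch and the subdivision strategy are assumptions, M is assumed positive, and rounding errors are ignored.

```python
import math


def parabola_lower_bound(a, b, fa, fb, M):
    """Lower bound for min f on [a, b], assuming f'' <= M there (M > 0)."""
    if M <= 0.0:
        raise ValueError("this sketch assumes M > 0")
    slope = (fb - fa) / (b - a)
    x = 0.5 * (a + b) - slope / M      # stationary point of the parabola
    if x <= a or x >= b:
        return min(fa, fb)             # parabola is monotonic on [a, b]
    chord = fa + slope * (x - a)
    return chord - 0.5 * M * (x - a) * (b - x)


def global_min_sketch(f, a, b, M, tol=1e-6):
    """Return (x, f(x)) with f(x) within tol of the global minimum on [a, b]."""
    fa, fb = f(a), f(b)
    best_x, best_f = (a, fa) if fa <= fb else (b, fb)
    stack = [(a, fa, b, fb)]
    while stack:
        lo, flo, hi, fhi = stack.pop()
        # Discard intervals that cannot improve on the best value by tol.
        if parabola_lower_bound(lo, hi, flo, fhi, M) >= best_f - tol:
            continue
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid < best_f:
            best_x, best_f = mid, fmid
        stack.append((lo, flo, mid, fmid))
        stack.append((mid, fmid, hi, fhi))
    return best_x, best_f


# sin has second derivative -sin(x) <= 1, so M = 1 is a valid bound.
x_star, f_star = global_min_sketch(math.sin, 0.0, 2.0 * math.pi, 1.0)
```

An interval of width h is discarded once M h^2 / 8 is at most tol, so the subdivision always terminates.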

In Section 6.4, we try to obtain some insight into the behaviour
of our algorithm by considering some tractable special cases. Then, in
Sections 6.5 and 6.6, we show that no algorithm which uses only function
evaluations  and  an  upper  bound  on  f"  could  be  much  faster  than  our 
algorithm.  Finally,  a  generalization  to  functions  of  several  variables 
is  given  in  Section  6.8.  The  conditions  on  f  are  much  weaker  than 
unimodality  (Newman  (1965)).  The  generalization  is  not  practically  useful 
for  functions  of  more  than  three  variables,  and  it  is  an  open  question 
whether  a  significantly  better  algorithm  is  possible. 

Chapter  7 

In Chapter 7 we describe a modification of Powell's (1964) algorithm
for  finding  a  local  minimum  of  a  function  of  several  variables,  without 
calculating  derivatives.  The  modification  is  designed  to  ensure 
quadratic  convergence,  and  to  avoid  the  difficulties  with  Powell's 
criterion  for  accepting  new  search  directions. 

First,  a  brief  introduction  to  the  problem  and  a  survey  of  the 
recent literature are given in Section 7.1. The effect of rounding errors
on the limiting accuracy attainable is discussed in Section 7.2. Powell's
algorithm is described in Section 7.3, and our main modification is given
in Section 7.4. The idea of the modification (finding the principal axes
of  an  approximating  quadratic  form)  is  not  new:  for  example,  it  is  used 
by  Greenstadt  (1967)  in  his  quasi-Newton  method.  Unlike  Greenstadt, 
though,  we  do  not  use  an  explicit  approximation  to  the  Hessian  matrix. 

An interesting feature of our modification is that it is possible to avoid
squaring the condition number of the eigenvalue problem by using a singular
value decomposition: see Section 7.4 for the details.

In Sections 7.5 and 7.6 we describe some additional features of our
algorithm. Then, in Section 7.7, we give the results of some numerical
experiments, and compare our method with those of Powell (1964), Davies,
Swann and Campey (Swann (1964)), and Stewart (1967). For the comparison
we have used numerical results obtained by Fletcher (1965) and Stewart
(1967). The numerical results suggest that our algorithm is competitive
with  the  currently  used  algorithms  which  do  not  require  the  user  to 
compute  derivatives,  although  it  is  difficult  to  reach  a  definite 
conclusion  without  more  practical  experience. 

Finally,  we  give  a  bibliography  of  the  recent  literature  on 
nonlinear  minimization,  with  the  emphasis  being  on  methods  for  solving 
unconstrained  problems. 


2.1 


1.  Introduction 

In  this  chapter  we  collect  some  results  which  are  needed  in  Chapters 
3  and  6.  The  reader  who  is  mainly  interested  in  the  practical  applications 
described  in  Chapters  4  to  7  might  prefer  to  skip  this  chapter,  except  for 
Section  2,  and  refer  back  to  it  when  necessary. 

Classical  expressions  for  the  error  in  truncated  Taylor  series  and 
Lagrangian interpolation often involve a term $f^{(n+1)}(\xi)$, where $\xi$ is an
unknown point in some interval. For such expressions to be valid, f must
have n+1 derivatives. Several of the results of this chapter give
expressions which are valid if $f^{(n)}$ satisfies a (possibly one-sided)
Lipschitz condition. In these results, the term $f^{(n+1)}(\xi)$ is replaced
by a number which is bounded by a Lipschitz constant. It seems unlikely
that  these  results  are  new,  but  they  have  not  been  found  in  the  literature 
except  where  references  are  given. 

The results of Chapter 3 depend heavily on Theorem 5.1, which gives
an expansion of the divided difference $f[x_0, \ldots, x_n]$ (see Section 2) near
the origin. This theorem, and the less cumbersome Corollary 5.1, are
useful  for  the  analysis  of  interpolation  processes,  for  the  coefficients 
of  the  interpolating  polynomials  can  be  expressed  in  terms  of  divided 
differences  (see  Chapter  3). 

Finally, in Section 6, we extend some results of Ralston (1963) on
the derivative of the error term in Lagrangian interpolation. These
results are relevant to Chapter 3, although they are given mainly for
their independent interest. Perhaps the most interesting result is
Theorem 6.2, which shows that, if we are only concerned with the points
of interpolation, then we can differentiate the classical expression for
the error (equation (6.4)), regarding the term $f^{(n+1)}(\xi(x))$ as constant.
This is well-known if f has n+1 continuous derivatives, but Theorem 6.2
shows that it is sufficient for f to have n continuous derivatives.


2.  Notation  and  definitions 

Throughout this chapter [a,b] is a nonempty, finite, closed
interval, and f is a real-valued function defined on [a,b]. n is
a nonnegative integer, M a nonnegative real number, and α a number
in (0,1].

Definitions 

The modulus of continuity w(f;δ) of f (in [a,b]) is defined by

$$w(f;\delta) = \sup_{\substack{x,y \in [a,b] \\ |x-y| \le \delta}} |f(x)-f(y)| , \tag{2.1}$$

for all δ > 0.

If f has a continuous n-th derivative on [a,b], then we write
$f \in C^n[a,b]$. If, in addition, $f^{(n)} \in \mathrm{Lip}_M\,\alpha$, i.e.,

$$w(f^{(n)};\delta) \le M\delta^{\alpha} \tag{2.2}$$

for all δ > 0, then we write $f \in LC^n[a,b;M,\alpha]$. (This notation is not
standard, but it is convenient if we want to mention the constants M
and α explicitly.) If $f \in LC^n[a,b;M,1]$ then we write simply
$f \in LC^n[a,b;M]$.

If $x_0, \ldots, x_n$ are distinct points in [a,b], then $IP(f;x_0,\ldots,x_n)$
is the Lagrangian interpolation polynomial, i.e., the unique polynomial
of degree n or less which coincides with f at $x_0, \ldots, x_n$. The
divided difference $f[x_0,\ldots,x_n]$ is defined by

$$f[x_0,\ldots,x_n] = \sum_{i=0}^{n} \frac{f(x_i)}{\prod_{j=0,\, j \ne i}^{n} (x_i - x_j)} . \tag{2.3}$$
(There are many other notations: see for example, Milne (1949),
Milne-Thomson (1953), and Traub (1964).) Note that, although we suppose
for simplicity that $x_0, \ldots, x_n$ are distinct, nearly all the results given
here and in Chapter 3 hold if some of $x_0, \ldots, x_n$ coincide. (We then have
Hermite  interpolation  and  confluent  divided  differences:  see  Traub  (1964).) 
For  the  statement  of  these  results,  the  word  "distinct"  is  enclosed  in 
parentheses.


Newton’s  identities 

For future reference, we note the following useful identities (see Cauchy (1840), Isaacson and Keller (1966), or Traub (1964)). The first is often used as the definition of the divided difference $f[x_0,\dots,x_n]$, while the second gives an explicit representation of the interpolating polynomial and remainder.

1. $f[x_0] = f(x_0)$ and, for $n \ge 1$,

$$f[x_0,\dots,x_n] = \frac{f[x_0,\dots,x_{n-1}] - f[x_1,\dots,x_n]}{x_0 - x_n}. \tag{2.4}$$

2. If $P = IP(f;x_0,\dots,x_n)$, then

$$f(x) = P(x) + \Bigl(\prod_{i=0}^{n} (x - x_i)\Bigr) f[x_0,\dots,x_n,x], \tag{2.5}$$

and

$$P(x) = f[x_0] + (x-x_0) f[x_0,x_1] + \dots + (x-x_0)\cdots(x-x_{n-1}) f[x_0,\dots,x_n]. \tag{2.6}$$
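
The two identities translate directly into a short computation. The following sketch builds the divided differences by the recurrence (2.4) and evaluates the Newton form (2.6) by nested multiplication. The function names and the sample data ($f(x) = x^3$ at the points 0, 1, 3) are illustrative assumptions; exact rational arithmetic is used so that no rounding error enters.

```python
from fractions import Fraction

def divided_differences(xs, fs):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]] using the recurrence (2.4)."""
    n = len(xs)
    table = list(fs)              # table[i] = f[x_i]
    coeffs = [table[0]]
    for order in range(1, n):
        # After this pass, table[i] = f[x_i, ..., x_{i+order}].
        table = [
            (table[i] - table[i + 1]) / (xs[i] - xs[i + order])
            for i in range(n - order)
        ]
        coeffs.append(table[0])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate the Newton form (2.6) at x by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = coeffs[k] + (x - xs[k]) * result
    return result

# Hypothetical example: interpolate f(x) = x**3 at 0, 1, 3.
xs = [Fraction(0), Fraction(1), Fraction(3)]
fs = [x**3 for x in xs]
coeffs = divided_differences(xs, fs)
```

With these data the remainder term of (2.5) can be checked by hand at $x = 2$: the product $(x-0)(x-1)(x-3)$ equals $-2$, the third-order divided difference of a cubic with leading coefficient 1 equals 1, and indeed $f(2) - P(2) = 8 - 10 = -2$.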


3. Truncated Taylor series


In this section we give some forms of Taylor's theorem. Lemma 3.1 is needed in Chapter 6, and applies if $f^{(n)}$ satisfies a one-sided Lipschitz condition.


Lemma 3.1

Suppose that $f \in C^n[0,b]$ for some $b > 0$, and that there is a constant $M$ such that, for all $y \in [0,b]$,

$$f^{(n)}(y) - f^{(n)}(0) \le My. \tag{3.1}$$

Then, for all $x \in [0,b]$,

$$f(x) = \sum_{r=0}^{n} \frac{x^r}{r!} f^{(r)}(0) + \frac{x^{n+1}}{(n+1)!}\, m(x), \tag{3.2}$$

where

$$m(x) \le M. \tag{3.3}$$


Remarks


The proof is by induction on $n$, and is omitted. The corresponding two-sided result is immediate, and is generalized in Lemma 3.2 below. In Lemma 3.2, fractional factorials are defined in the usual way, so

$$(n+\alpha)!/\alpha! = (1+\alpha)(2+\alpha)\cdots(n+\alpha). \tag{3.4}$$

Lemma 3.2

If $f \in LC^n[a,b;M,\alpha]$ and $x,y \in [a,b]$, then

$$f(x) = \sum_{r=0}^{n} \frac{(x-y)^r}{r!} f^{(r)}(y) + |x-y|^{n+\alpha}\, m(x,y)\, \alpha!/(n+\alpha)!, \tag{3.5}$$

where

$$|m(x,y)| \le M. \tag{3.6}$$


Remarks

The result is trivial if $n = 0$, and for $n \ge 1$ it follows from Taylor's theorem with the integral form for the remainder, using the integral

$$\int_0^x \frac{(x-t)^{n-1}\, t^{\alpha}}{(n-1)!}\, dt = x^{n+\alpha}\, \alpha!/(n+\alpha)! \tag{3.7}$$

for $x \ge 0$.

Note that the bound (3.6) is sharp, as can be seen from the example

$$f(x) = x^{n+\alpha}, \tag{3.8}$$

with $y = 0$ and $M = (n+\alpha)!/\alpha!$. Since, for $n \ge 1$,

$$n! < (n+\alpha)!/\alpha!, \tag{3.9}$$

the bound obtained from the classical result

$$f(x) = \sum_{r=0}^{n-1} \frac{(x-y)^r}{r!} f^{(r)}(y) + \frac{(x-y)^n}{n!} f^{(n)}(\xi), \tag{3.10}$$

for some $\xi$ between $x$ and $y$, is not sharp.

The following lemma, used in Chapter 6, gives a one-sided bound on the error in Lagrangian interpolation, if $f^{(n)}$ satisfies a one-sided Lipschitz condition. Thus, it corresponds to Lemma 3.1. The corresponding two-sided result follows from Theorem 5 of Baker (1970), but the proof given here is simpler, and similar to the usual proof of the classical result that, if $f \in C^{n+1}[a,b]$, then $m(x) = f^{(n+1)}(\xi(x))$, for some $\xi(x) \in [a,b]$. (See, for example, Isaacson and Keller (1966), pg. 190.)


Lemma 4.1

Suppose that $f \in C^n[a,b]$; $x_0,\dots,x_n$ are (distinct) points in $[a,b]$; $P = IP(f;x_0,\dots,x_n)$; and, for all $x,y \in [a,b]$ with $x > y$,

$$f^{(n)}(x) - f^{(n)}(y) \le M(x-y). \tag{4.1}$$

Then, for all $x \in [a,b]$,

$$f(x) = P(x) + \Bigl(\prod_{r=0}^{n} (x - x_r)\Bigr) \frac{m(x)}{(n+1)!}, \tag{4.2}$$

where

$$m(x) \le M. \tag{4.3}$$


Proof

Suppose that $n > 0$ and $x \ne x_r$ for any $r = 0,\dots,n$, for otherwise the result is trivial. Let

$$w(x) = \prod_{r=0}^{n} (x - x_r), \tag{4.4}$$

and write


5. Divided differences

Lemma 5.1 and Theorem 5.1 are needed in Chapter 3. The first part of Lemma 5.1 follows immediately from Lemma 4.1 and the identity (2.5) (we state the two-sided result for variety), while the second part is well-known, and follows similarly. Theorem 5.1 is more interesting, and most of the results of Chapter 3 depend on it. It may be regarded as a generalization of Taylor's theorem (the special case $n = 0$).

Lemma 5.1

Suppose that $f \in LC^n[a,b;M]$ and that $x_0,\dots,x_{n+1}$ are (distinct) points in $[a,b]$. Then

$$f[x_0,\dots,x_{n+1}] = m/(n+1)!, \tag{5.1}$$

where

$$|m| \le M. \tag{5.2}$$

Furthermore, if $f \in C^{n+1}[a,b]$, then

$$m = f^{(n+1)}(\xi) \tag{5.3}$$

for some $\xi \in [a,b]$.


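Theorem 5.1

Suppose that $k, n \ge 0$; $f \in C^{n+k}[a,b]$; $a \le 0$; $b \ge 0$; and $x_0,\dots,x_n$ are (distinct) points in $[a,b]$. Then

$$f[x_0,\dots,x_n] = \frac{f^{(n)}(0)}{n!} + \Bigl(\sum_{0 \le r_1 \le n} x_{r_1}\Bigr) \frac{f^{(n+1)}(0)}{(n+1)!} + \dots + \Bigl(\sum_{0 \le r_1 \le \dots \le r_{k-1} \le n} x_{r_1} \cdots x_{r_{k-1}}\Bigr) \frac{f^{(n+k-1)}(0)}{(n+k-1)!} + R, \tag{5.4}$$

where

$$R = \sum_{0 \le r_1 \le \dots \le r_k \le n} x_{r_1} \cdots x_{r_k}\, \frac{f^{(n+k)}(\xi_{r_1,\dots,r_k})}{(n+k)!} \tag{5.5}$$

for some $\xi_{r_1,\dots,r_k}$ in the interval spanned by $x_{r_1},\dots,x_{r_k}$ and $0$.


Corollary 5.1

If, in Theorem 5.1,

$$\delta = \max_{r=0,\dots,n} |x_r|, \tag{5.6}$$

then

$$\Bigl| R - \Bigl(\sum_{0 \le r_1 \le \dots \le r_k \le n} x_{r_1} \cdots x_{r_k}\Bigr) \frac{f^{(n+k)}(0)}{(n+k)!} \Bigr| \le \frac{\delta^k}{n!\,k!}\, w(f^{(n+k)};\delta). \tag{5.7}$$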


Proof of Theorem 5.1

The result for $k = 0$ is immediate from the second part of Lemma 5.1, so suppose that $k > 0$. Take points $y_0,\dots,y_n$ which are distinct, and distinct from $x_0,\dots,x_n$. Then

$$f[x_0,\dots,x_n] - f[y_0,\dots,y_n] = \sum_{r=0}^{n} \bigl\{ f[x_0,\dots,x_r,y_{r+1},\dots,y_n] - f[x_0,\dots,x_{r-1},y_r,\dots,y_n] \bigr\} \tag{5.8}$$

$$= \sum_{r=0}^{n} (x_r - y_r)\, f[x_0,\dots,x_r,y_r,\dots,y_n], \tag{5.9}$$

by the identity (2.4).

We may suppose, by induction on $k$, that the theorem holds if $k$ is replaced by $k-1$ and $n$ by $n+1$. Use this result to expand each term in (5.9), and consider the limit as $y_0,\dots,y_n$ tend to $0$. By the second part of Lemma 5.1, $f[y_0,\dots,y_n]$ tends to $f^{(n)}(0)/n!$, so the result follows. (Strictly, to show the existence of the points $\xi_{r_1,\dots,r_k}$, we must add to the inductive hypothesis the result that $f^{(n+k)}(\xi_{r_1,\dots,r_k})$ is a continuous function of $x_{r_1},\dots,x_{r_k}$.)

Corollary 5.1 is immediate, once we note that there are exactly $(n+k)!/(n!\,k!)$ terms in the sum (5.5).


6. Differentiating the error

The two theorems in this section are concerned with differentiating the error term for Lagrangian interpolation. These theorems are not needed later, but are included for their independent interest, and also because they may be used to give alternative proofs of some of the results of Chapter 3 (see Kowalik and Osborne (1968), pp. 18-20).

Theorem 6.1 is given by Ralston (1963, 1965) if $f \in C^{n+1}[a,b]$. We state the result under the slightly weaker assumption that $f \in LC^n[a,b;M]$ for some $M$: the only difference in the conclusion is that Ralston's term $f^{(n+1)}(\eta(x))$ is replaced by $m(x)$, where $|m(x)| \le M$. The proof is similar to that given by Ralston (1963), and is also similar to the proof of Lemma 6.2 below, so it is omitted.

Theorem 6.2 gives an expression for the derivative of the error at the points of interpolation. If $f \in LC^n[a,b;M]$ then the result follows immediately from Theorem 6.1, but Theorem 6.2 shows that $f \in C^n[a,b]$ is sufficient. This result may be of some independent interest.


Theorem 6.1

Suppose that $n \ge 1$; $f \in LC^n[a,b;M]$; $x_0,\dots,x_{n-1}$ are (distinct) points in $[a,b]$; $w(x) = (x-x_0)\cdots(x-x_{n-1})$; $P = IP(f;x_0,\dots,x_{n-1})$; and $f(x) = P(x) + R(x)$. Then there are functions $\xi: [a,b] \to [a,b]$ and $m: [a,b] \to [-M,M]$, such that

1. $f^{(n)}(\xi(x))$ is a continuous function of $x \in [a,b]$ (although $\xi(x)$ is not necessarily continuous);

2. $m(x)$ is continuous on $[a,b]$, except possibly at $x_0,\dots,x_{n-1}$;

3. for all $x \in [a,b]$,

$$R(x) = w(x) f^{(n)}(\xi(x))/n! \tag{6.1}$$

and

$$R'(x) = w'(x) f^{(n)}(\xi(x))/n! + w(x) m(x)/(n+1)!; \tag{6.2}$$

and

4. if $x \ne x_r$ for $r = 0,\dots,n-1$, then

$$\frac{d}{dx} f^{(n)}(\xi(x)) = \frac{m(x)}{n+1}. \tag{6.3}$$

Theorem 6.2

Suppose that $n \ge 1$; $f \in C^n[a,b]$; $x_0,\dots,x_{n-1}$ are (distinct) points in $[a,b]$; $w(x) = (x-x_0)\cdots(x-x_{n-1})$; $P = IP(f;x_0,\dots,x_{n-1})$; and $f(x) = P(x) + R(x)$. Then there is a function $\xi: [a,b] \to [a,b]$, such that $f^{(n)}(\xi(x))$ is a continuous function of $x \in [a,b]$; for all $x \in [a,b]$,

$$R(x) = w(x) f^{(n)}(\xi(x))/n!; \tag{6.4}$$

and, for $r = 0,\dots,n-1$,

$$R'(x_r) = w'(x_r) f^{(n)}(\xi(x_r))/n!. \tag{6.5}$$

Before proving Theorem 6.2, we need some lemmas. Note the similarity between Lemma 6.2 and Theorem 6.1.

Lemma 6.1

Suppose that $n \ge 1$; $f \in C^n[a,b]$; $x_0,\dots,x_n$ are distinct points in $[a,b]$; $P = IP(f;x_0,\dots,x_n)$;

$$A = \max_{x \in [a,b]} |f^{(n)}(x)|; \tag{6.6}$$

and

$$\delta = \max_{0 \le i < j \le n} |x_i - x_j|. \tag{6.7}$$

Then, for all $x \in [a,b]$,

$$f(x) = P(x) + \Bigl(\prod_{r=0}^{n} (x - x_r)\Bigr) S(x), \tag{6.8}$$

where

$$|S(x)| \le \frac{2A}{n!\,\delta}. \tag{6.9}$$


Proof

If $x = x_r$ for some $r = 0,\dots,n$, then we can take $S(x) = 0$. Otherwise, by the identity (2.5),

$$S(x) = f[x_0,\dots,x_n,x]. \tag{6.10}$$

Write $x_{n+1}$ for $x$ and reorder $x_0,\dots,x_{n+1}$ (if necessary) so that, if the reordered points are $x'_0,\dots,x'_{n+1}$, then

$$|x'_0 - x'_{n+1}| = \max_{0 \le i < j \le n+1} |x'_i - x'_j| \ge \delta. \tag{6.11}$$

From (6.10) and the identity (2.4),

$$S(x) = \frac{f[x'_0,\dots,x'_n] - f[x'_1,\dots,x'_{n+1}]}{x'_0 - x'_{n+1}}, \tag{6.12}$$

so, by Lemma 5.1,

$$S(x) = \frac{f^{(n)}(\xi) - f^{(n)}(\xi')}{n!\,(x'_0 - x'_{n+1})} \tag{6.13}$$

for some $\xi$ and $\xi'$ in $[a,b]$. In view of (6.6) and (6.11), the result follows.

Lemma 6.2

Suppose that $n \ge 2$; $f \in C^n[a,b]$; $x_0,\dots,x_{n-1}$ are distinct points in $[a,b]$; $A = \max_{x \in [a,b]} |f^{(n)}(x)|$; $\delta = \max_{0 \le i < j < n} |x_i - x_j|$; $P_n = IP(f;x_0,\dots,x_{n-1})$; $w_n(x) = (x-x_0)\cdots(x-x_{n-1})$; and $f(x) = P_n(x) + R(x)$. Then there is a function $\xi: [a,b] \to [a,b]$ such that, for all $x \in [a,b]$, $f^{(n)}(\xi(x))$ is a continuous function of $x$,

$$R(x) = w_n(x) f^{(n)}(\xi(x))/n!, \tag{6.14}$$

$$\bigl|R'(x) - w'_n(x) f^{(n)}(\xi(x))/n!\bigr| \le \frac{2A\,|w_n(x)|}{n!\,\delta}, \tag{6.15}$$

and, if $x \ne x_r$ for $r = 0,\dots,n-1$, then

$$\Bigl|\frac{d}{dx} f^{(n)}(\xi(x))\Bigr| \le \frac{2A}{\delta}. \tag{6.16}$$


Proof

Let $x_n$ be a point in $[a,b]$, distinct from $x$ and $x_0,\dots,x_{n-1}$. For $k = n$ or $n+1$, define

$$P_k = IP(f;x_0,\dots,x_{k-1}) \tag{6.17}$$

and

$$w_k(x) = (x-x_0)\cdots(x-x_{k-1}). \tag{6.18}$$

By the classical result corresponding to Lemma 4.1, there is a function $\xi$ such that (6.14) holds. Suppose, until further notice, that $x \ne x_r$ for $r = 0,\dots,n$. Then, from (6.14) and the identity

$$P_k(x) = \sum_{r=0}^{k-1} \frac{f(x_r)\, w_k(x)}{(x - x_r)\, w'_k(x_r)}, \tag{6.19}$$

we have

$$\frac{f^{(n)}(\xi(x))}{n!} = \frac{f(x)}{w_n(x)} - \sum_{r=0}^{n-1} \frac{f(x_r)}{(x - x_r)\, w'_n(x_r)}. \tag{6.20}$$

Since the right side of (6.20) is continuously differentiable at $x$, so is the left side, and

$$\frac{1}{n!} \frac{d}{dx} f^{(n)}(\xi(x)) = \frac{d}{dx}\Bigl(\frac{f(x)}{w_n(x)}\Bigr) + \sum_{r=0}^{n-1} \frac{f(x_r)}{(x - x_r)^2\, w'_n(x_r)}. \tag{6.21}$$


Define $S(x,x_n)$ by

$$f(x) = P_{n+1}(x) + w_{n+1}(x) S(x,x_n). \tag{6.22}$$

Since

$$w'_{n+1}(x_r) = \begin{cases} w_n(x_n) & \text{if } r = n, \\ (x_r - x_n)\, w'_n(x_r) & \text{if } r = 0,\dots,n-1, \end{cases} \tag{6.23}$$

equation (6.19) gives

$$\frac{P_{n+1}(x)}{w_{n+1}(x)} = \sum_{r=0}^{n-1} \frac{f(x_r)}{(x-x_r)(x_r-x_n)\, w'_n(x_r)} + \frac{f(x_n)}{(x-x_n)\, w_n(x_n)}, \tag{6.24}$$

so

$$S(x,x_n) = \frac{\dfrac{f(x)}{w_n(x)} - \dfrac{f(x_n)}{w_n(x_n)}}{x - x_n} + \sum_{r=0}^{n-1} \frac{f(x_r)}{(x-x_r)(x_n-x_r)\, w'_n(x_r)}. \tag{6.25}$$

As $x_n \to x$, the right side of (6.25) tends to the right side of (6.21). Thus, there exists

$$\lim_{x_n \to x} S(x,x_n) = \frac{1}{n!} \frac{d}{dx} f^{(n)}(\xi(x)), \tag{6.26}$$

and, from the definition (6.22) and Lemma 6.1, this proves (6.16). Now, by differentiating the right side of (6.14) by parts, we see that (6.15) holds, in fact

$$R'(x) = \frac{1}{n!} \Bigl\{ w'_n(x) f^{(n)}(\xi(x)) + w_n(x) \frac{d}{dx} f^{(n)}(\xi(x)) \Bigr\}, \tag{6.27}$$

provided that $x \ne x_r$, for $r = 0,\dots,n-1$. Consider (6.27) near one of the points $x_r$, $r = 0,\dots,n-1$. $R'(x)$ is continuous at $x_r$, $w_n(x_r) = 0$, $w'_n(x_r) \ne 0$, and, by (6.16), $\frac{d}{dx} f^{(n)}(\xi(x))$ is bounded for $x \ne x_r$. Thus $f^{(n)}(\xi(x))$ has, at worst, a removable discontinuity at $x_r$, and, by the continuity of $f^{(n)}(\xi)$ as a function of $\xi$, a suitable redefinition of $\xi(x_r)$ will ensure that $f^{(n)}(\xi(x))$ is a continuous function of $x$, and that

$$R'(x_r) = w'_n(x_r) f^{(n)}(\xi(x_r))/n!. \tag{6.28}$$

This completes the proof of the lemma.


Proof of Theorem 6.2

If $n \ge 2$ then the result follows immediately from Lemma 6.2. If $n = 1$, choose $\xi(x)$ so that $\xi(x_0) = x_0$ and, for $x \ne x_0$,

$$f'(\xi(x)) = \frac{f(x) - f(x_0)}{x - x_0}. \tag{6.29}$$

Then $f'(\xi(x))$ is a continuous function of $x \in [a,b]$, and, as $R(x) = f(x) - f(x_0)$ and $w(x) = x - x_0$, it is easy to see that equations (6.4) and (6.5) are satisfied. Thus, the theorem holds for all $n \ge 1$.


Chapter 3


The Use of Successive Interpolation for Finding Simple
Zeros of a Function and its Derivatives


1. Introduction


Suppose that $q \ge 1$ and $f \in C^{q-1}[a,b]$. Given (distinct) points $x_0,\dots,x_q$ in $[a,b]$, a sequence $(x_n)$ may be defined in the following way: if $x_0,\dots,x_{n+q}$ are already defined, let $P_n = IP(f;x_n,\dots,x_{n+q})$ be the $q$-th degree polynomial which coincides with $f$ at $x_n,\dots,x_{n+q}$, and choose $x_{n+q+1}$ so that

$$P_n^{(q-1)}(x_{n+q+1}) = 0. \tag{1.1}$$

Under certain conditions the sequence $(x_n)$ is well-defined by (1.1), lies in $[a,b]$, and converges to a zero $\zeta$ of $f^{(q-1)}$. In this chapter we give sufficient conditions for convergence, and estimate the asymptotic rate of convergence, making various assumptions about the differentiability of $f$.

Since $P_n$ is a polynomial of degree $q$, (1.1) is a linear equation in $x_{n+q+1}$. If

$$f[x_n,\dots,x_{n+q}] \ne 0, \tag{1.2}$$

then Lemma 3.1 shows that the unique solution is

$$x_{n+q+1} = \frac{1}{q} \Bigl\{ \sum_{i=0}^{q-1} x_{n+i} - \frac{f[x_n,\dots,x_{n+q-1}]}{f[x_n,\dots,x_{n+q}]} \Bigr\}, \tag{1.3}$$

and this might be used as an alternative definition. From Section 4 on, our assumptions ensure that $x_n,\dots,x_{n+q}$ are sufficiently close to a simple zero $\zeta$ of $f^{(q-1)}$, so (1.2) holds. In Section 3, the assumption that $f^{(q)}(\zeta) \ne 0$ is unnecessary: all that is required is that $x_{n+q+1}$ is a (not necessarily unique) solution of (1.1).
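
The explicit solution of (1.1) in terms of two divided differences can be tried out directly. The sketch below is a bare illustration under stated assumptions: the function names, the starting points, and the two test functions are chosen here and are not from the text, and there are no safeguards against rounding errors or coincident points (the text defers practical algorithms to Chapters 4 and 5). Each new point is the mean of the $q$ oldest points of the current window, corrected by the ratio of the divided differences of orders $q-1$ and $q$.

```python
import math

def divided_difference(f, pts):
    """f[pts[0], ..., pts[-1]] by the usual recurrence; points must be distinct."""
    table = [f(p) for p in pts]
    for order in range(1, len(pts)):
        table = [
            (table[i] - table[i + 1]) / (pts[i] - pts[i + order])
            for i in range(len(pts) - order)
        ]
    return table[0]

def next_point(f, pts):
    """Given x_n, ..., x_{n+q} in pts, return x_{n+q+1}."""
    q = len(pts) - 1
    top = divided_difference(f, pts[:-1])   # f[x_n, ..., x_{n+q-1}]
    bottom = divided_difference(f, pts)     # f[x_n, ..., x_{n+q}], assumed nonzero
    return (sum(pts[:-1]) - top / bottom) / q

def successive_interpolation(f, start, steps):
    """Extend the q+1 starting points by `steps` further iterates."""
    pts = list(start)
    q = len(pts) - 1
    for _ in range(steps):
        pts.append(next_point(f, pts[-(q + 1):]))
    return pts

# q = 1 (successive linear interpolation): zero of x^2 - 2.
secant = successive_interpolation(lambda x: x * x - 2.0, [1.0, 2.0], 5)

# q = 2 (successive parabolic interpolation): turning point of exp(x) - 2x.
parabolic = successive_interpolation(
    lambda x: math.exp(x) - 2.0 * x, [0.0, 0.5, 1.0], 8
)
```

In the first example the iterates approach $\sqrt{2}$, a zero of $f$; in the second they approach $\log 2$, where $f'$ vanishes.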

The cases of most practical interest are $q = 1$, 2 and 3. For $q = 1$, our successive interpolation process reduces to the familiar method of successive linear interpolation for finding a zero of $f$, and some of our results are well-known (see Collatz (1964), Householder (1971), Ortega and Rheinboldt (1970), Ostrowski (1966), Schroder (1870), Traub (1964, 1967) etc.). For $q = 2$, we have a process of successive parabolic interpolation for finding a turning point, and, for $q = 3$, a process for finding an inflexion point. These two cases are discussed separately by Jarratt (1967, 1968), who assumes that $f$ is analytic near $\zeta$. By using (1.3) and Theorem 2.5.1, we show that much milder assumptions on the smoothness of $f$ suffice (see Theorems 4.1, 5.1 and 7.1). Also, most of our results hold for any $q \ge 1$, and the proofs are no more difficult than those for the special cases $q = 2$ and $q = 3$.

Some  simplifying  assumptions 

Practical algorithms for finding zeros and extrema, using the results of this chapter, are discussed in Chapters 4 and 5. Until then we ignore the problem of rounding errors, and usually suppose that the initial approximations $x_0,\dots,x_q$ are sufficiently good.

For the sake of simplicity, we assume that any $q+1$ consecutive points $x_n,\dots,x_{n+q}$ are distinct. (This is always true in the applications described in Chapters 4 and 5.) Thus, $P_n$ is just the Lagrange interpolation polynomial, and the results of Chapter 2 are applicable. As in Chapter 2, the assumption of distinct points is not necessary, and the same results hold without this assumption if $P_n$ is the appropriate Hermite interpolation polynomial.

A preview of the results

The definition of "order of convergence" is discussed in Section 2, and in Section 3 we show that, if a sequence $(x_n)$ satisfies (1.1) and converges to $\zeta$, then $f^{(q-1)}(\zeta) = 0$ (Theorem 3.1).

In Sections 4 to 7, we consider the rate of convergence to a simple zero $\zeta$ of $f^{(q-1)}$, making increasingly stronger assumptions about the smoothness of $f$. For practical applications, the most important result is probably Theorem 4.1, which shows that convergence is superlinear if $f \in C^q$ and the starting values are sufficiently good. As in similar results for Newton's method (Collatz (1964), Kantorovich and Akilov (1959), Ortega (1968), Ortega and Rheinboldt (1970) etc.), it is possible to say precisely what "sufficiently good" means. Theorem 5.1 is an easy consequence of Theorem 4.1 and the theory of linear difference equations (Norlund (1954)), and gives a lower bound on the order of convergence if $f^{(q)}$ is Lipschitz continuous.

The question of when the order of convergence is equal to the lower bound given by Theorem 5.1 is the subject of Sections 6 and 7. Although the results are interesting, they are not of much practical importance, for in practical problems it is merely a pleasant surprise if the iterative process converges faster than expected. Thus, the reader whose main interest is practical applications might prefer to skip Sections 6 and 7 (and also Theorem 3.1), except for Lemma 6.1.

In Section 8, we consider the interesting problem of accelerating the rate of convergence, and Theorem 8.1 shows how this may be done. We make use of Lemma 6.1, which gives a recurrence relation for the error in successive approximations to $\zeta$, and is a generalization of results of Ostrowski (1966) and Jarratt (1967, 1968).

Finally, in Section 9 the theoretical results are illustrated by some numerical examples, and a brief summary of the main theorems is given in Section 10. The reader may find it worthwhile to glance at this summary occasionally in order to see the pattern of the results.


2. The definition of order


Suppose that $\lim_{n\to\infty} x_n = \zeta$. There are many reasonable definitions of the "order of convergence" of the sequence $(x_n)$. For example, we could say that the order of convergence is $\rho$ if any one of (2.1) to (2.4) holds:

$$\lim_{n\to\infty} \frac{|x_{n+1} - \zeta|}{|x_n - \zeta|^{\rho}} = K > 0, \tag{2.1}$$

$$\lim_{n\to\infty} \frac{\log|x_{n+1} - \zeta|}{\log|x_n - \zeta|} = \rho, \tag{2.2}$$

$$\lim_{n\to\infty} (-\log|x_n - \zeta|)^{1/n} = \rho, \tag{2.3}$$

$$\liminf_{n\to\infty} (-\log|x_n - \zeta|)^{1/n} = \rho. \tag{2.4}$$

These conditions are in decreasing order of strength, i.e., (2.1) $\Rightarrow$ (2.2) $\Rightarrow$ (2.3) $\Rightarrow$ (2.4), and none of them are equivalent. (2.1) is used by Ostrowski (1966), Jarratt (1967) and Traub (1964, 1967), while (2.2) is used by Wall (1956), Tornheim (1964) and Jarratt (1968). Voigt (1969) and Ortega and Rheinboldt (1970) give some more possibilities (for example, we may take the supremum of $\rho$ such that the limit $K$ in (2.1) exists and is zero, or the infimum of $\rho$ such that $K$ is infinite). See also Schroder (1870). For our purposes it is convenient to use (2.1) and (2.4), so we make the following definitions.

Definition 2.1

We say $x_n \to \zeta$ with strong order $\rho$ and asymptotic constant $K$ if $x_n \to \zeta$ as $n \to \infty$ and (2.1) holds.

We say $x_n \to \zeta$ with weak order $\rho$ if $x_n \to \zeta$ as $n \to \infty$ and (2.4) holds. (If $x_n = \zeta$ for all sufficiently large $n$ then we say that $x_n \to \zeta$ with weak order $\infty$.)

Definition 2.2

Let

$$c = \limsup_{n\to\infty} |x_n - \zeta|^{1/n}. \tag{2.5}$$

We say $x_n \to \zeta$ sublinearly (or less than linearly) if $x_n \to \zeta$ and $c = 1$. We say $x_n \to \zeta$ linearly if $0 < c < 1$. We say $x_n \to \zeta$ superlinearly if $c = 0$. We say $x_n \to \zeta$ strictly superlinearly if $x_n \to \zeta$ with weak order $\rho > 1$.

Examples

Some remarks and examples may help to clarify the definitions. If $\rho > 1$ and $x_n = \exp(-\rho^n)(1 + o(1))$ as $n \to \infty$, then $x_n \to 0$ with strong order $\rho$ and asymptotic constant 1. If $\sigma > 1$ and $x_n = \exp(-\sigma^n)(2 + (-1)^n)$, then $x_n \to 0$ with weak order $\sigma$, but not with any strong order, for the limit in (2.1) does not exist if $\rho = \sigma$, is zero if $\rho < \sigma$, and is infinite if $\rho > \sigma$. Thus, convergence with strong order $\rho$ implies convergence with weak order $\rho$, but not conversely.

If the limit in (2.1) or (2.4) exists, and $x_n \to \zeta$, then $\rho \ge 1$. If the limit (2.1) exists with $\rho = 1$, and $x_n \to \zeta$, then $K \le 1$ ($K < 1$ for linear convergence, and $K = 1$ for sublinear convergence). Examples of sublinear, linear, superlinear, and strictly superlinear convergence are $x_n = 1/n$, $2^{-n}$, $n^{-n}$, and $2^{-2^n}$ respectively.
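
The four example sequences can be compared numerically. The sketch below evaluates, at a single large $n$, the $n$-th root of the error (whose lim sup separates sublinear, linear and superlinear convergence) and the $n$-th root of minus the logarithm of the error (whose lim inf is the weak order). It works with $-\log|x_n - \zeta|$ throughout, since $2^{-2^n}$ underflows almost immediately in floating point. The dictionary `neg_log`, the helper names, and the choice $n = 200$ are assumptions made for this illustration; a single finite $n$ only suggests the limiting behaviour.

```python
import math

# neg_log[name](n) = -log|x_n - zeta| for the four example sequences (zeta = 0).
neg_log = {
    "1/n":      lambda n: math.log(n),
    "2^-n":     lambda n: n * math.log(2),
    "n^-n":     lambda n: n * math.log(n),
    "2^-(2^n)": lambda n: 2.0 ** n * math.log(2),
}

def root_factor(L, n):
    """|x_n - zeta|^(1/n): tends to 1, to a value in (0,1), or to 0."""
    return math.exp(-L(n) / n)

def weak_order_estimate(L, n):
    """(-log|x_n - zeta|)^(1/n): tends to the weak order of the sequence."""
    return L(n) ** (1.0 / n)

n = 200
summary = {
    name: (root_factor(L, n), weak_order_estimate(L, n))
    for name, L in neg_log.items()
}
```

At $n = 200$ the first component is close to 1 for $1/n$, equal to $1/2$ for $2^{-n}$, and small for the last two sequences, while the second component is close to 1 for the first three sequences and close to 2 for $2^{-2^n}$, the only strictly superlinear example.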


3. Convergence to a zero

In this section we show that, if the sequence $(x_n)$ defined by (1.1) converges, then it must converge to a zero of $f^{(q-1)}$, assuming only that $f \in C^{q-1}[a,b]$. First, we need a lemma which gives a relation between the points $x_n,\dots,x_{n+q+1}$.

Lemma 3.1

If $x_n, x_{n+1},\dots,x_{n+q}$ are (distinct) points in $[a,b]$, and $x_{n+q+1}$ satisfies (1.1), then

$$\Bigl(\sum_{i=0}^{q-1} (x_{n+i} - x_{n+q+1})\Bigr) f[x_n,\dots,x_{n+q}] = f[x_n,\dots,x_{n+q-1}]. \tag{3.1}$$


Proof

By the identity (2.2.6),

$$P_n(x) = f[x_n] + (x-x_n) f[x_n,x_{n+1}] + \dots + (x-x_n)\cdots(x-x_{n+q-1}) f[x_n,\dots,x_{n+q}], \tag{3.2}$$

so

$$P_n^{(q-1)}(x) = (q-1)! \Bigl\{ f[x_n,\dots,x_{n+q-1}] - \Bigl(\sum_{i=0}^{q-1} (x_{n+i} - x)\Bigr) f[x_n,\dots,x_{n+q}] \Bigr\}. \tag{3.3}$$

Thus, the result follows from (1.1).


Theorem 3.1

Suppose that $f \in C^{q-1}[a,b]$; that a sequence $(x_n)$ satisfying (1.1) is defined (see Section 1) in $[a,b]$; and that there exists $\lim_{n\to\infty} x_n = \zeta$. Then $f^{(q-1)}(\zeta) = 0$.

Proof

Suppose, by way of contradiction, that

$$f^{(q-1)}(\zeta) \ne 0. \tag{3.4}$$

For $0 \le r < q$, the identity (2.2.4) shows that

$$(x_{n+r} - x_{n+q}) f[x_n,\dots,x_{n+q}] = f[x_n,\dots,x_{n+q-1}] - f[x_n,\dots,x_{n+r-1},x_{n+r+1},\dots,x_{n+q}]. \tag{3.5}$$

Thus, from Lemma 3.1,

$$x_{n+r} - x_{n+q} = \mu_{n,r} \sum_{i=0}^{q-1} (x_{n+i} - x_{n+q+1}), \tag{3.6}$$

where

$$\mu_{n,r} = 1 - \frac{f[x_n,\dots,x_{n+r-1},x_{n+r+1},\dots,x_{n+q}]}{f[x_n,\dots,x_{n+q-1}]}. \tag{3.7}$$

Both divided differences in (3.7) tend to $f^{(q-1)}(\zeta)/(q-1)!$ as $n \to \infty$, so there is no loss of generality in assuming that the denominator $f[x_n,\dots,x_{n+q-1}]$ is nonzero for all $n$ (on the assumption (3.4)), and we have

$$\lim_{n\to\infty} \mu_{n,r} = 0. \tag{3.8}$$


Summing (3.6) over $r = 0,\dots,q-1$ and rearranging terms gives

$$\sum_{r=0}^{q-1} (x_{n+r} - x_{n+q+1}) = \mu'_n (x_{n+q} - x_{n+q+1}), \tag{3.9}$$

where

$$\mu'_n = \frac{q}{1 - \sum_{r=0}^{q-1} \mu_{n,r}}, \tag{3.10}$$

and, by (3.8), there is no loss of generality in assuming that the denominator in (3.10) is nonzero for all $n \ge 0$. From (3.6), with $r = q-1$, and (3.9), we have

$$x_{n+q-1} - x_{n+q} = \mu_n (x_{n+q} - x_{n+q+1}), \tag{3.11}$$

where

$$\mu_n = \mu_{n,q-1}\, \mu'_n. \tag{3.12}$$


The repeated application of (3.11) gives

$$x_{q-1} - x_q = \mu_0 \mu_1 \cdots \mu_n (x_{n+q} - x_{n+q+1}), \tag{3.13}$$

and, by (3.8), (3.10) and (3.12), $\mu_n \to 0$ as $n \to \infty$, so the right side of (3.13) tends to zero as $n \to \infty$. This contradicts the assumption that $x_{q-1} \ne x_q$, so (3.4) must be false, and the proof is complete. (If we do not wish to assume that any $q+1$ consecutive points $x_n,\dots,x_{n+q}$ are distinct, then we may argue as follows: on the assumption (3.4), the right side of (3.1) is nonzero for all sufficiently large $n$, and thus at least two consecutive points from $x_n,\dots,x_{n+q+1}$ are distinct. Taking these two points in place of $x_{q-1}$ and $x_q$, we get a contradiction in the same way as from (3.13).)


4. Superlinear convergence

If $f$ has one more continuous derivative than required in Theorem 3.1, then Theorem 4.1 shows that convergence to a simple zero of $f^{(q-1)}$ is superlinear, in the sense of Definition 2.2, provided the starting values are sufficiently good. The theorem makes precise what we mean by "sufficiently good". (In equation (4.1), $w$ is the modulus of continuity: see Section 2.2.) Convergence to a multiple zero of $f^{(q-1)}$ is not usually superlinear, even if $q = 1$ (see Section 4.2), and Theorem 3.1 above is the only theorem in this chapter for which we do not need to assume that the zero is simple. Thus, there is no reason to expect that the algorithms described in Chapters 4 and 5 will converge any faster than linearly to multiple zeros of $f^{(q-1)}$.

Theorem 4.1

Suppose that $f \in C^q[a,b]$; $\zeta \in [a,b]$; $x_0,\dots,x_q$ are (distinct) points in $[a,b]$; $\delta_0 = \max_{i=0,\dots,q} |x_i - \zeta|$; $f^{(q-1)}(\zeta) = 0$; $[\zeta - \delta_0, \zeta + \delta_0] \subseteq [a,b]$; and

$$3w(f^{(q)};\delta_0) < |f^{(q)}(\zeta)|. \tag{4.1}$$
where

$$|R_2| \le w(f^{(q)};\delta_0)/|f^{(q)}(0)| = \lambda_0/3 < 1/3, \tag{4.9}$$

so

$$|R_3| = \Bigl|\frac{R_2}{1+R_2}\Bigr| \le \lambda_0/2 < 1/2. \tag{4.10}$$

(Note that the assumption (4.1) ensures that $f[x_0,\dots,x_q] \ne 0$.)

From (4.5), (4.8), and Lemma 3.1 (with $x_0$ and $x_q$ interchanged),

.  <-> 


where 


*i>  ^TM+V1+r5)  • 

1=1  n 


(U.I2) 


From (4.6), (4.7) and (4.10), equation (4.12) gives

I  I  X0&,lf(q)(°)  I  ,  36'w(f(q);5«) 

W  -  2. (q-1)  l  2.(q-l)!  * 


(4-13) 


so, from (4.3) and (4.7),


^6'|f(l)(0)| 

IM  <  - 


(4.14) 


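Now, from (4.11), we have

$$|x_{q+1}| \le \lambda_0 \delta'. \tag{4.15}$$

By the assumption (4.1), $\lambda_0 < 1$, so $x_{q+1}$ lies in $[a,b]$, $\delta_1$ and $\lambda_1$ are well-defined, $\delta_1 = \delta' \le \delta_0$, $\lambda_1 \le \lambda_0$, and

$$|x_{q+1}| \le \lambda_0 \delta_1. \tag{4.16}$$

In the same way, we see that $\delta_0 \ge \delta_1 \ge \delta_2 \ge \dots$, $1 > \lambda_0 \ge \lambda_1 \ge \lambda_2 \ge \dots$, and, for $n \ge 0$,

$$|x_{n+q+1}| \le \lambda_n \delta_{n+1}. \tag{4.17}$$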

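Thus, the inequality (4.4) holds, and it only remains to show that $x_n \to 0$ superlinearly. From (4.4) and the above,

$$\delta_{kq+1} \le \lambda_0 \lambda_q \cdots \lambda_{(k-1)q}\, \delta_1 \le \lambda_0^k\, \delta_1, \tag{4.18}$$

and $\lambda_0 < 1$ by assumption (4.1), so $\delta_n \to 0$ as $n \to \infty$. Thus, by the continuity of $f^{(q)}$ and the definition (4.3), $\lambda_n \to 0$ as $n \to \infty$. Take any $\varepsilon > 0$. For all sufficiently large $n$,

$$\lambda_n \le \varepsilon, \tag{4.19}$$

so, from (4.4),

$$\limsup_{n\to\infty} \delta_n^{1/n} \le \varepsilon^{1/q}. \tag{4.20}$$

As $\varepsilon$ is arbitrarily small, this shows that

$$\lim_{n\to\infty} |x_n|^{1/n} = \lim_{n\to\infty} \delta_n^{1/n} = 0. \tag{4.21}$$

Thus, $x_n \to \zeta = 0$ superlinearly, and the proof is complete.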


Remarks

The proof of Theorem 4.1 shows that, for $n \ge 0$, $|x_{n+q+1} - \zeta|$ is no greater than the second-largest of $|x_n - \zeta|,\dots,|x_{n+q} - \zeta|$. Thus, if $q = 1$, the sequence $(|x_n - \zeta|)$ is monotonic decreasing, except perhaps for the first term. In fact, the proof shows that, for $q = 1$,

$$\frac{|x_{n+1} - \zeta|}{|x_n - \zeta|} \to 0 \quad \text{as } n \to \infty \tag{4.22}$$

(provided $x_n \ne \zeta$). This is a common definition of "superlinear convergence", stronger than our Definition 2.2.

If $q \ge 2$, the sequence $(|x_n - \zeta|)$ need not be eventually monotonic decreasing: monotonicity would follow from strong superlinear convergence with order greater than 1, but more conditions are necessary to ensure this sort of convergence (see Sections 6 and 7).
Since the case $\alpha = 1$ often occurs, we write simply $\beta_q$ for $\beta_{q,1}$ and $\gamma_q$ for $\gamma_{q,1}$.


Remarks

$\beta_{q,\alpha}$ is just the positive real root of (5.1), and it is easy to see that, for $0 < \alpha \le 1$,

$$(1+\alpha)^{1/(q+1)} < \beta_{q,\alpha} < (1+\alpha)^{1/q}. \tag{5.2}$$

We are only interested in the constants $\gamma_{q,\alpha}$ when $\alpha = 1$. If $\alpha = 1$ and $q \ge 2$ then there are exactly two complex conjugate roots of (5.1) with modulus $\gamma_q$. If $q = 1$ or 2 then $\gamma_q < 1$, but, for $q \ge 3$,

$$1 < \gamma_q < \beta_q.$$

This may be proved by applying the Lehmer-Schur test to show that, for suitable $\varepsilon > 0$, exactly $q-2$ roots of

$$x^{q+1} = x + 1 \tag{5.3}$$

lie in the circle $|x| < 1 + \varepsilon$. The details are omitted, for all cases of practical interest are covered by Table 5.1, which gives $\beta_q$ and $\gamma_q$ to 12 decimal places for $q = 1,\dots,10$. The table was computed by finding all roots of (5.3) with the program of Jenkins (1969), and the entries are the correctly rounded values of $\beta_q$ and $\gamma_q$ if Jenkins's a posteriori error bounds are correct.
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Table  5.1:  The  constants  3^  and  7^  for  q  =  1(1)10  to  12D 


(~ 

1 

q 

7q 

-  ■ 

. . — 

- - - - - - - 

1 

1.618033988750 

0.618033988750 

t 

2 

1 

1.324717957245 

0.868836961833 

!  5 

1.22071+1+084606 

1.063336938821 

4 

1.167303978261 

1.099000315146 

5 

1.134724138402 

1.099174913506 

6 

1.112775684279 

1.091953305766 

7 

1.096981557799 

1.083743696285 

8 

1.085070245491 

1.076133134033 

9 

1.075766066087 

1.069448852721 

10 

1.068297188921 

1.063666938404 

See  Definition  5.1  and  the  remarks  above  for  a  description  of 

the  constants  8  and  7 
q  q 
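The $\beta_q$ column can be checked independently of the root-finding program cited above. The following is a minimal sketch, not the method used for the table: it assumes only that $\beta_{q,\alpha}$ is the unique positive root of $x^{q+1} = x + \alpha$, and locates it by bisection on [1, 2]. The function name `beta` and the iteration count are illustrative choices.

```python
def beta(q, alpha=1.0, iterations=100):
    """Positive real root of x**(q+1) = x + alpha, by bisection on [1, 2].

    Assumes 0 < alpha <= 1 and q >= 1, so that g(1) < 0 < g(2).
    """
    def g(x):
        return x ** (q + 1) - x - alpha

    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for q in range(1, 11):
        b = beta(q)
        # Bound (5.2) with alpha = 1: 2**(1/(q+1)) < beta_q < 2**(1/q)
        inside = 2.0 ** (1.0 / (q + 1)) < b < 2.0 ** (1.0 / q)
        print(f"q = {q:2d}  beta_q = {b:.12f}  bound (5.2) holds: {inside}")
```

Double precision is enough to compare with the 12-decimal entries; the $\gamma_q$ column needs all complex roots and is not covered by this sketch.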
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Theorem  5«1 

Suppose that $f \in LC^q[a,b;M,\alpha]$ (see Section 2.2); $\zeta \in (a,b)$;
$f^{(q-1)}(\zeta) = 0$; and $f^{(q)}(\zeta) \ne 0$. If $x_0, \ldots, x_q$ are
(distinct and) sufficiently close to $\zeta$, then a sequence $(x_n)$ is
uniquely defined by (1.1), and $x_n \to \zeta$ with weak order at least
$\beta_{q,\alpha}$, the positive real root of $x^{q+1} = x + \alpha$.


Remark

If $\delta_0 = \max_{i=0,\ldots,q} |x_i - \zeta|$, then, from Theorem 4.1,
$x_0, \ldots, x_q$ are "sufficiently close" to $\zeta$ if
$\delta_0 < \zeta - a$, $\delta_0 < b - \zeta$, and

    $3M\delta_0^{\alpha} < |f^{(q)}(\zeta)|$ .    (5.4)

If these conditions are satisfied, then an upper bound on $|x_n - \zeta|$
follows from equation (5.10) below.

Proof of Theorem 5.1

For $n \ge 0$, let

    $\delta_n = \max_{i=0,\ldots,q} |x_{n+i} - \zeta|$ .    (5.5)

Suppose that $x_0, \ldots, x_q$ are so close to $\zeta$ that the conditions
mentioned in the remark above are satisfied. Then Theorem 4.1 shows
that $(\delta_n)$ is monotonic decreasing to zero, and

    $\delta_{n+q+1} \le \frac{3M}{|f^{(q)}(\zeta)|}\,\delta_n^{\alpha}\,\delta_{n+1}$ .    (5.6)

If eventually $\delta_n = 0$, then the result follows immediately: by
our definition, $x_n \to \zeta$ with weak order $\infty$. Hence, suppose that
$\delta_n \ne 0$ for all $n \ge 0$ (and thus, from (5.6), M > 0). Let

    $\lambda_n = -\log\left[\left(\frac{3M}{|f^{(q)}(\zeta)|}\right)^{1/\alpha}\delta_n\right]$    (5.7)

(not the same $\lambda_n$ as in Theorem 4.1). From condition (5.4) and the
fact that $(\delta_n)$ is monotonic decreasing,
$0 < \lambda_0 \le \lambda_1 \le \lambda_2 \le \ldots$, and, from
equation (5.6),

    $\lambda_{n+q+1} \ge \lambda_{n+1} + \alpha\lambda_n$ .    (5.8)

Since $\beta_{q,\alpha} > 1$, we have

    $\lambda_n \ge \lambda_0\,\beta_{q,\alpha}^{\,n-q}$    (5.9)

for n = 0, ..., q. Thus, from (5.8) and the definition of
$\beta_{q,\alpha}$, the inequality (5.9) holds for all $n \ge 0$, by
induction on n. Hence, for all $n \ge 0$,

    $-\log|x_n - \zeta| \ge -\log\delta_n \ge \lambda_0\,\beta_{q,\alpha}^{\,n-q} + \frac{1}{\alpha}\log\frac{3M}{|f^{(q)}(\zeta)|}$ .    (5.10)

Since $\lambda_0 > 0$ and $\beta_{q,\alpha} > 1$, equation (5.10) shows that

    $\liminf_{n\to\infty}\,(-\log|x_n - \zeta|)^{1/n} \ge \beta_{q,\alpha}$ ,    (5.11)

which completes the proof.
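The induction step in the proof is short enough to write out. The sketch below assumes the recurrence (5.8) in the form $\lambda_{n+q+1} \ge \lambda_{n+1} + \alpha\lambda_n$ and the bound (5.9) in the form $\lambda_n \ge \lambda_0\beta^{n-q}$, writes $\beta$ for $\beta_{q,\alpha}$, and uses only the defining relation $\beta^{q+1} = \beta + \alpha$.

```latex
\begin{aligned}
\lambda_{n+q+1}
  &\ge \lambda_{n+1} + \alpha\,\lambda_n
     && \text{by (5.8)} \\
  &\ge \lambda_0\,\beta^{\,n+1-q} + \alpha\,\lambda_0\,\beta^{\,n-q}
     && \text{by (5.9) for the indices } n \text{ and } n+1 \\
  &= \lambda_0\,\beta^{\,n-q}\,(\beta + \alpha)
   = \lambda_0\,\beta^{\,n-q}\,\beta^{\,q+1}
   = \lambda_0\,\beta^{\,(n+q+1)-q} .
\end{aligned}
```

So (5.9) at the indices n and n+1 gives (5.9) at the index n+q+1. The base cases n = 0, ..., q hold because $\lambda_n \ge \lambda_0$ and $\beta^{n-q} \le 1$ there.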

Note that, in the important case $\alpha = 1$, there is a simple proof of
Theorem 5.1 which does not depend on Theorems 2.5.1 and 4.1. Also, this
proof shows that, instead of (5.4), the condition

    $3M\delta_0 < 2|f^{(q)}(\zeta)|$    (5.12)

is sufficient. The idea is this: by applying Rolle's Theorem q-1
times, we see that $P_n^{(q-1)}(x)$ coincides with $f^{(q-1)}(x)$ at points
$\xi_n$ and $\xi_n'$ say, with $|\xi_n - \zeta| \le \delta_n$ and
$|\xi_n' - \zeta| \le \delta_n'$ = the second largest of
$|x_n - \zeta|, \ldots, |x_{n+q} - \zeta|$. Thus, from Lemma 2.4.1,

    $|P_n^{(q-1)}(\zeta)| \le \tfrac{1}{2} M \delta_n \delta_n'$ .    (5.13)

On the other hand, equations (1.1) and (3.3) show that

    $x_{n+q+1} = \zeta - \frac{P_n^{(q-1)}(\zeta)}{q!\,f[x_n, \ldots, x_{n+q}]}$ ,    (5.14)

so we can bound $|x_{n+q+1} - \zeta|$, and then the result follows in much
the same way as above.


6.  The  exact  order  of  convergence 


Theorem 5.1 gives conditions under which $x_n \to \zeta$ with weak order at
least $\beta_q$. It is natural to ask if the order is exactly $\beta_q$. In
general, this is true, but some conditions are necessary to ensure that
the rate of convergence is not too fast: for example, the successive
linear interpolation process (q = 1) converges to a simple zero $\zeta$
with weak order at least 2 ($> \beta_1 = 1.618\ldots$) if it happens that
$f''(\zeta) = 0$, for then linear interpolation is more accurate than would
normally be expected. Theorem 6.1 gives sufficient conditions for the
order to be exactly $\beta_q$. Apart from the condition
$f^{(q+1)}(\zeta) \ne 0$, it is necessary to impose some conditions on the
initial points $x_0, \ldots, x_q$. (These extra conditions are superfluous
if q = 1: see Section 7.)

Before proving Theorem 6.1, we need two lemmas. Lemma 6.2 is
concerned with the solution of a certain difference equation, and is
closely related to Theorem 12.1 of Ostrowski (1966). (The lemma could
easily be generalized, but we only need the result stated.) Lemma 6.1
gives a recurrence relation for the error $x_n - \zeta$. Special cases of
this lemma have been given by Ostrowski (1966) and Jarratt (1967, 1968).
Ostrowski essentially gives the case q = 1, and Jarratt gives weaker
results for q = 2 and q = 3. (Our bound on the remainder $R_n$ is
sharper than Jarratt's, and we do not assume that f is analytic.) In
Section 8, we show how the result of Lemma 6.1 may be used to accelerate
convergence of the sequence $(x_n)$.


Lemma  6.1 

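Suppose that $f \in C^{q+1}[a,b]$; $\zeta \in [a,b]$; $f^{(q-1)}(\zeta) = 0$;
$f^{(q)}(\zeta) \ne 0$; $x_n, \ldots, x_{n+q}$ are (distinct) points in
[a,b]; and $x_{n+q+1}$ satisfies equation (1.1). Let $\delta_n$ be the
largest of $|x_n - \zeta|, \ldots, |x_{n+q} - \zeta|$, and $\delta_n'$ the
second largest. Then

    $x_{n+q+1} - \zeta = \frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)} \sum_{0 \le i < j \le q} (x_{n+i} - \zeta)(x_{n+j} - \zeta) + R_n$ ,    (6.1)

where

    $R_n = O(\delta_n \delta_n' [\delta_n + w(f^{(q+1)}; \delta_n)])$    (6.2)

as $\delta_n \to 0$.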
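For orientation, here are the two smallest cases of (6.1) written out. This is only a specialization, assuming the leading coefficient in (6.1) is $f^{(q+1)}(\zeta)/(q(q+1)f^{(q)}(\zeta))$; the abbreviation $e_k = x_k - \zeta$ is introduced here for brevity and is not the text's notation.

```latex
\begin{aligned}
q = 1:\quad e_{n+2} &= \frac{f''(\zeta)}{2 f'(\zeta)}\; e_n e_{n+1} + R_n , \\[4pt]
q = 2:\quad e_{n+3} &= \frac{f'''(\zeta)}{6 f''(\zeta)}\,
      \bigl( e_n e_{n+1} + e_n e_{n+2} + e_{n+1} e_{n+2} \bigr) + R_n .
\end{aligned}
```

The case q = 1 is the familiar error relation for successive linear interpolation (the secant method).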


Proof 

Without loss of generality, assume that n = 0 and $\zeta = 0$. Rearrange
$x_0, \ldots, x_q$, if necessary, so that $|x_0| \le |x_1| \le \ldots \le |x_q|$.
From Lemma 3.1,

    $q\,x_{q+1}\,f[x_0, \ldots, x_q] = \left(\sum_{i=0}^{q-1} x_i\right) f[x_0, \ldots, x_q] - f[x_0, \ldots, x_{q-1}]$ .

Thus, as $f^{(q-1)}(0) = 0 \ne f^{(q)}(0)$, Theorem 2.5.1 gives

q.x  +1  (1+r  ) 

^  q+1  qi  v  V 

=  %  Xi  +  (L  X±\  (q+l)l  +  r2 


(6.3) 


i=0 


q-i  f(q)/0\ 

V21* 


(  y  x  x  >  i(^1m  +  r ) 

0<i<J<q  'I'  5>' 

(6.4) 


where 


w(f(q) ;6_) 


(0)1 


-  °(8o)  > 


(6.5) 


lr2l  5  60w(f(q+l) ;60)/qI  =  0(60w(f^q+1^ ;6q))  ,  (6.6) 


and. 


g*^w/f>(q.+^-)  .gt\ 

N  5  °  2(o-l)  1  ’  °  -  ity) 


(6.7) 


as  -»  0 


The  right  side  of  (6.4)  is  Just 


0  <i < j <q 
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(6.8) 


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as  n  -*  od  . 


If $0 < s < \gamma_q$ then

    $\lambda_n = c\,\beta_q^n + O(n^{\nu}\gamma_q^n)$    (6.15)

as $n \to \infty$, where

    $\nu = 0$ if q = 1, and $\nu = 1$ if q > 1,    (6.16)

and c is a nonnegative constant.


Proof 

The  restriction  |u  |  <1  in  Theorem  12.1  of  Ostrowski  (1966)  is 
unnecessary,  for  we  can  choose  any  X  with  |u^  {  <  X  <  |u  J  and 
consider  kn/\n  >  instead  of  X ^  ,  in  Ostrowski1  s  proof.  Thus,  in  view 
of  the  remarks  after  Definition  5.1,  (6.1^)  and  (6.15)  follow  from 

Ostrowski's Theorem 12.1. (6.14) does not follow directly in the same
way, but the proof of Ostrowski's Theorem 12.1 goes through, assuming
$k_n = o(s^n)$ instead of $k_n = O(s^n)$, and giving a result from which
(6.14) follows. The only difficulty is in proving the modified form of
Ostrowski's Lemma 12.1, but this follows from the Toeplitz lemma: if
$k_n \to 0$, $|\xi| < 1$, and $z_n = k_n + k_{n-1}\xi + \ldots + k_0\xi^n$,
then $z_n \to 0$ as $n \to \infty$ (see Ortega and Rheinboldt (1970), pg. 399).

Theorem  6.1 

Suppose $f \in C^{q+1}[a,b]$; $\zeta \in (a,b)$; $f^{(q-1)}(\zeta) = 0$;
$f^{(q)}(\zeta) \ne 0$; and $f^{(q+1)}(\zeta) \ne 0$. If $|x_0 - \zeta|$ is
sufficiently small,

    $|x_{i-1} - \zeta| \ge 4|x_i - \zeta|$    (6.17)

for i = 1, 2, ..., q, and

    $|x_q - \zeta| \ge 8|K(x_0 - \zeta)(x_1 - \zeta)| > 0$ ,    (6.18)

where

    $K = \frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)}$ ,    (6.19)

then a sequence $(x_n)$ is uniquely defined by (1.1), and $x_n \to \zeta$ with
weak order exactly $\beta_q$. In fact, if q = 1 or 2 then $x_n \to \zeta$ with
strong order $\beta_q$ and asymptotic constant $|K|^{\beta_q - 1}$, and if
$q \ge 3$ then

    $-\log|x_n - \zeta| = c\,\beta_q^n + O(n\gamma_q^n)$    (6.20)

as $n \to \infty$, for some positive constant c.

Remarks

Condition (6.17) ensures that $x_0, \ldots, x_q$ approach $\zeta$ sufficiently
fast, while (6.18) makes sure that they do not approach $\zeta$ too fast.
These conditions could be weakened, but Theorem 7.1 shows that some such
conditions are necessary if $q \ge 2$. If q = 1 then the conditions
are superfluous: see Corollary 7.1.

Equation (6.20) implies that (2.2) holds with $\rho = \beta_q$, but (2.1)
does not necessarily hold, for $\gamma_q > 1$ if $q \ge 3$.

Proof of Theorem 6.1

Let

    $y_n = |K(x_n - \zeta)|$ .    (6.21)

From the assumptions (6.17) and (6.18) we have, at least for n = 0,

    $y_{n+i-1} \ge 4y_{n+i}$    (6.22)

for i = 1, 2, ..., q, and

    $y_{n+q} \ge 8y_n y_{n+1} > 0$ .    (6.23)


We shall show that (6.22) and (6.23) hold for all $n \ge 0$. Suppose, as
inductive hypothesis, that they hold for all $n \le m$. Then, by taking
$|x_0 - \zeta|$ sufficiently small (independent of m), we may suppose that
the remainder $R_n$ of Lemma 6.1 satisfies


lmnl  ^  S'Al 


(6.24) 


for all $n \le m$. Thus, from Lemma 6.1,


^m+q+l 


^  +  h +  75 *  nr +  ••• + 


-  2  ymym+l 


(6.25) 


From (6.23) with n = m, this gives

    $y_{m+q} \ge 4y_{m+q+1}$ .    (6.26)


Similarly, 


Wl  -  yrVl(1 


1 

5 


42  1? 


i\ 

±y 


-  2  ymymH 


*  W#2 


(6.27) 

(6.28) 


Also, from (6.27), $y_{m+q+1} > 0$, so the right side of (6.28) is positive.
From (6.26) and (6.28), we see that (6.22) and (6.23) hold for n = m+1,
so they hold for all $n \ge 0$, by induction. Thus (6.25) and (6.27) hold
for all $m \ge 0$.


Let

    $\lambda_n = -\log y_n$    (6.29)

and

    $k_n = \lambda_{n+q+1} - \lambda_{n+1} - \lambda_n$ .    (6.30)

From (6.25) and (6.27),

    $|k_n| \le \log 2$ ,    (6.31)

so we may apply Lemma 6.2 with s = 1. If $q \ge 3$ then $\gamma_q > 1$, so

    $\lambda_n = c\,\beta_q^n + O(n\gamma_q^n)$    (6.32)

as $n \to \infty$. From Theorem 5.1, c > 0, so the result for $q \ge 3$
follows. If q = 1 or 2 then $\gamma_q < 1$, so

    $\lambda_n = c\,\beta_q^n + O(1)$    (6.33)

as $n \to \infty$. From (6.29), (6.30), (6.33) and Lemma 6.1, we now see that

    $k_n = o(1)$    (6.34)

as $n \to \infty$, so, by equation (6.14) with s = 1,

    $\lambda_n = c\,\beta_q^n + o(1)$    (6.35)

as $n \to \infty$. Thus, there exists

    $\lim_{n\to\infty} \frac{y_{n+1}}{y_n^{\beta_q}} = 1$ ,    (6.36)

so the result follows from equation (6.21). (Note that, if
$f^{(q+1)} \in \mathrm{Lip}_M\,\alpha$ for any M and $\alpha > 0$, then (6.34)
may be replaced by $k_n = o(s^n)$ for any s > 0, so (6.15) holds, and

    $\frac{|x_{n+1} - \zeta|}{|x_n - \zeta|^{\beta_q}} = |K|^{\beta_q - 1} + O(n^{q-1}\gamma_q^n)$    (6.37)

as $n \to \infty$.)


7. Stronger results for q = 1 and 2

7  •  Stronger  results  for  q  =  1  and  2 

In this section we restrict our attention to the two cases of the
greatest practical interest, q = 1 (successive linear interpolation)
and q = 2 (successive parabolic interpolation for finding an extreme
point). Corollary 7.1 shows that the conditions (6.17) and (6.18) of
Theorem 6.1 are unnecessary if q = 1.

Corollary 7.1

Suppose that q = 1; $f \in C^2[a,b]$; $\zeta \in (a,b)$; $f(\zeta) = 0$;
$f'(\zeta) \ne 0$; and $f''(\zeta) \ne 0$. If $x_0$, $x_1$ and $\zeta$ are
distinct and sufficiently close together, then a sequence $(x_n)$ is
uniquely defined by (1.1), and $x_n \to \zeta$ with strong order
$\tfrac{1}{2}(1 + \sqrt{5})$ and asymptotic constant
$|f''(\zeta)/(2f'(\zeta))|^{\beta_1 - 1}$ as $n \to \infty$.

Proof

From Lemma 6.1,

    $x_2 - \zeta = (x_0 - \zeta)(x_1 - \zeta)(K + o(1))$    (7.1)

as $\max(|x_0 - \zeta|, |x_1 - \zeta|) \to 0$, where $K = f''(\zeta)/(2f'(\zeta))$
as in (6.19). Thus, Theorem 6.1 is applicable to the sequence $(x_n')$,
where $x_n' = x_{n+1}$, provided $x_0$ and $x_1$ are sufficiently close
to $\zeta$.
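Corollary 7.1 can be observed numerically. The sketch below is illustrative only: it borrows the q = 1 example of Section 9, f(x) = x + x^2 + x^3 with x_0 = 2 and x_1 = 1, for which $\zeta = 0$ and $f''(0)/(2f'(0)) = 1$, so the ratios $|x_{n+1}| / |x_n|^{\beta_1}$ should approach 1. The helper name `secant_iterates` is an assumption of this sketch, and ordinary double precision is used.

```python
import math


def f(x):
    # q = 1 example of Section 9: simple zero at 0, f'(0) = 1, f''(0) = 2.
    return x + x**2 + x**3


def secant_iterates(func, x0, x1, count):
    """Return [x_0, x_1, ..., x_{count+1}] from successive linear interpolation."""
    xs = [x0, x1]
    for _ in range(count):
        a, b = xs[-2], xs[-1]
        # Zero of the straight line through (a, func(a)) and (b, func(b)).
        xs.append(b - func(b) * (b - a) / (func(b) - func(a)))
    return xs


beta_1 = (1.0 + math.sqrt(5.0)) / 2.0
xs = secant_iterates(f, 2.0, 1.0, 10)          # x_0, ..., x_11
# Ratios |x_{n+1} - zeta| / |x_n - zeta|**beta_1 with zeta = 0.
ratios = [abs(xs[n + 1]) / abs(xs[n]) ** beta_1 for n in range(len(xs) - 1)]

if __name__ == "__main__":
    for n, r in enumerate(ratios):
        print(f"n = {n:2d}   x_n = {xs[n]: .4e}   ratio = {r:.4f}")
```

The ratios settle slowly, since the correction term decays only like $\gamma_1^n$ with $\gamma_1 \approx 0.618$.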
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Remarks 

Ostrowski  (19 66)  gives  Corollary  7*1  with  the  stronger  assumption 
that  f  eC  [a,b]  .  He  also  shows  that,  if  f  eC  [a,b]  and  the 
conditions  of  Corollary  7*1  are  satisfied,  then 


as $n \to \infty$. As we remarked at the end of the proof of Theorem 6.1, the
relation (7.2) holds provided that $f \in LC^2[a,b;M,\alpha]$ for some M and
$\alpha$ (see equation (6.37)). For an even weaker condition, see (7.7) and
(7.8) below.

The following theorem removes the rather artificial restrictions
(6.17) and (6.18) of Theorem 6.1, if $f^{(q+1)}$ is Lipschitz continuous
and q = 1 or 2. The proof does not extend to $q \ge 3$, because it
depends on the assumption that $\gamma_q < 1$, which is only true for q = 1
and q = 2 (see Table 5.1).


Theorem  7*1 

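Suppose that q = 1 or 2; $f \in LC^{q+1}[a,b;M]$; $\zeta \in (a,b)$;
$f^{(q-1)}(\zeta) = 0$; and $f^{(q)}(\zeta) \ne 0$. If $x_0, \ldots, x_q$ are
(distinct and) sufficiently close to $\zeta$, then a sequence $(x_n)$ is
uniquely defined by (1.1), and either

1: $x_n \to \zeta$ with strong order $\beta_q$ and asymptotic constant
   $\left|\frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)}\right|^{\beta_q - 1}$ ;
   in fact

    $\frac{|x_{n+1} - \zeta|}{|x_n - \zeta|^{\beta_q}} = \left|\frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)}\right|^{\beta_q - 1} + O(n^{q-1}\gamma_q^n)$    (7.3)

   as $n \to \infty$ (recall that $\beta_1 \approx 1.618$, $\beta_2 \approx 1.325$,
   $\gamma_1 \approx 0.618$ and $\gamma_2 \approx 0.869$),

or

2: $x_n \to \zeta$ with weak order at least 2 if q = 1, or
   $(1 + \beta_1)^{1/3} \approx 1.378$ if q = 2.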


Remarks 

If q = 1 then, by Corollary 7.1, case 2 of Theorem 7.1 is
possible only if $f''(\zeta) = 0$ (or if one of $x_0$ and $x_1$ coincides
with $\zeta$, when the weak order is $\infty$).

If q = 2 then case 2 is possible, although unlikely, even if
$f^{(3)}(\zeta) \ne 0$ and $x_n \ne \zeta$ for all n. All that is necessary is
that the terms in relation (7.28) repeatedly nearly cancel out. Jarratt
(1967) and Kowalik and Osborne (1968) assume that such cancellation will
eventually die out, so the order will be $\beta_2$. The conditions (6.17)
and (6.18) are sufficient for this to be true, but without some such
conditions there is a remote possibility that cancellation will continue
indefinitely. For example, with $f(x) = 2x^3 + x^2$, there are starting
values $x_0$, $x_1$ and $x_2$ such that

    $x_{2n} \sim \exp(-2^n)$
and                                                    (7.4)
    $x_{2n+1} \sim -\exp(-2^n)$ ,

so $x_n \to \zeta = 0$ with weak order $\sqrt{2}$. Similarly, if

    $\gamma = \tfrac{1}{2}(3 + \sqrt{5}) = 2.618\ldots$ ,    (7.5)

then there are starting values such that

    $x_{3n} \sim \exp(-\gamma^n)$ ,
    $x_{3n+1} \sim \exp(-(\gamma - 1)\gamma^n)$ ,            (7.6)
    $x_{3n+2} \sim -\exp(-(\gamma - 1)\gamma^{n+1})$ ,

so $x_n \to 0$ with weak order $\gamma^{1/3} = 1.378\ldots$. The proof is
omitted, but the reader may easily verify that (7.4) and (7.6) are
compatible with Lemma 7.3 below (this depends on the relation
$2\gamma - 1 = \gamma(\gamma - 1)$).

For the sake of simplicity, we have not stated Theorem 7.1 in
the sharpest possible form. If $f^{(q+1)}(\zeta) = 0$, then $x_n \to \zeta$ with
weak order at least $\beta_{q,1+\alpha} > \beta_q$, provided that
$f^{(q+1)} \in \mathrm{Lip}_M\,\alpha$ for some M and $\alpha > 0$. If
$f^{(q+1)}(\zeta) \ne 0$, then the theorem holds provided that
$f \in C^{q+1}[a,b]$. Equation (7.3) may no longer hold, but if
there is an $\varepsilon > 0$ such that


(q+1) 


>(q+i) 


w(f(q+1);5)  =  0(  jlog  5|~e^q) 

as  8  -*  0  ,  then 


(7.7) 


l*n.l  -  Cl 


X  -  C 

n  3 


>(1+1) 


q(q+l)f'<1'  {£) 


V1 


°(nq~17q) 

if 

e 

>  1 

0(nq7q) 

if 

e 

=  1 

neN 

°(r„  ) 

if 

e 

<  1 

as $n \to \infty$. (A condition like (7.7) occurs in some variants of
Jackson's theorem: see Meinardus (1967).)
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Before  proving  Theorem  7*1>  we  need  three  rather  technical  lemmas. 


Lemma  7*1 


Suppose that, for $n \ge 0$,

    $x_{n+3} = x_n x_{n+1} + x_{n+1} x_{n+2} + x_n x_{n+2} + m_n \delta_n^2 \delta_n'$ ,    (7.9)

where $\delta_n$ is the largest of $|x_n|$, $|x_{n+1}|$ and $|x_{n+2}|$, and
$\delta_n'$ is the second largest. If there is a positive constant L such
that

>  lx0l  >  5lxil  >  9|*2l  >  271*5!  •  611(1 


lmnl  <  L 


(7- ID) 


for  all  n  >  0  ,  then  Jx  |  >  3|x  |  for  all  n  >  0  . 


Proof 


As  in  the  proof  of  Theorem  6.1,  it  follows  by  induction  on  n  that 

Ivjl  2  §  Ivwl  2  f  IV1V2I  2  5lvitl  ’  (7-n) 


for  all  n  >  0  . 


Lemma  7*2 

If the conditions of Lemma 7.1 are satisfied, then either $x_n = 0$
for all sufficiently large n, or

    $\frac{|x_{n+1}|}{|x_n|^{\beta_2}} = 1 + O(n\gamma_2^n)$

as $n \to \infty$.
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Proof 

If $x_n \ne 0$ for infinitely many n then, by Lemma 7.1, $x_n \ne 0$ for
all $n \ge 0$. If this is so, define $\lambda_n = -\log|x_n|$ and
$k_n = \lambda_{n+3} - \lambda_{n+1} - \lambda_n$. From equation (7.11), $k_n$
is bounded, so Lemma 6.2 with s = 1 gives
$\lambda_n = c\,\beta_2^n + O(1)$ as $n \to \infty$. By Lemma 7.1,
$\lambda_n \to +\infty$, so c > 0. Thus, from (7.9),

    $k_n = O(\exp\{-c(\beta_2 - 1)\beta_2^{n+1}\})$    (7.12)

as $n \to \infty$ (this is not necessarily true in the proof of Theorem 6.1).
Now, Lemma 6.2 with $s < \gamma_2$ gives

    $\lambda_n = c\,\beta_2^n + O(n\gamma_2^n)$

as $n \to \infty$, and the result follows from the definition of $\lambda_n$.


Lemma  7*5 

Suppose  that  (7*9)  and  (7*10)  hold.  Then  there  are  constants  K 
and  N  (depending  on  L)  such  that  if,  for  some  n  >  N  , 


(7-15) 


—  >  Jx  i  >  n|x 
n  -  1  n  -  1  n+21 


(7  >1*0 


—  >  |x . |  >  n|x  ^0| 
n  —  1  n+1'  —  1  n+2' 


(7-15) 


then 


=  x  x  ^  (1  +  v.,  )  , 

n  n+lv  l,ny  7 


x  x *(1  +  v„  )  +  x  ,.x  .-(l  +  v,  )  , 
n  n+lv  2,ny  n+1  n+2v  3,n'  7 


x  x^..  (1  +  v.  )  +  x  x  -X  ^0(1  +  v_  )  , 
n  n+lv  4 ,n'  n  n+1  n+2v  5,n7  7 


(7.16) 


(7.17) 


(7.18) 


%.  A*. 
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n.,n  ni2 
i+l 


f? .29; 


4nlxnVl 


,  <  2  x  x 

1  —  n  1 


n  n+11 


(7-30) 


ni+l 


(7.51) 


2 

lx  <  4n|x  x  , 

1  n+2  1  n  n+1 


(7.32) 


If  either  (7*30)  or  (7*32)  holds,  then  Lemma  7*3  is  applicable  for  all 
sufficiently  large  n  =  in  the  sequence  N  .  To  avoid  confusion 
with  subscripts,  write  m  for  ni+1  (so  m  =  n+2  or  n+3  ).  If 
n  =  rn  is  sufficiently  large,  and  (7-29)  and  (7*30)  hold,  then 


x  I  <  2|x  x  . , I 
m1  —  1  n  rt+11 


(7-33) 


and,  by  Lemma  7*3, 


lx  I  <  2  x  x  , .  | 
1  m+11  —  n  n+11 


(7.310 


If  (7«3l)  and  (7-32)  hold  then,  similarly, 


| x  |  <  2|x  x  . I 

1  m1  —  1  n  n+11 


(7-35) 


•  j 

x  , .  I  <  4|x  x"\ 
m+11  —  1  n  n+1 


(7-37.) 


yn  =  WlXnl  * 


(7-37^ 


Alter  a  fixed  n  -  n^  in  N  ,  suppose  that  the  next  r  >  1  elements 
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of  N  satisfy  (7*31),  and  then  the  next  s  >  1  .satisfy  (7.29).  Then 
repeated  use  of  the  inequalities  (7-33)  to  (7*36)  gives 

where 


»(r,«)  =  ♦  (^-2)(Lzll)r]  . 


Let 


t(r,s)  =cp(r,s) 


3r*2s 


(7.39) 


(7-40) 


For  fixed  s  >  1  ,  +(r, s)  is  a  decreasing  function  of  r  ,  with  limit 

(7.41) 


1 
rv-  3 


C  =  (?  t  =  inf  *(r,s) 

r,  s  >1 


as  r  -*  00  .  Thus,  x  -»  0  with  weak  order  at  least  c  ,  so  case  2  of 
’  n 

the  theorem  holds. 

Now suppose that there is no infinite sequence N as above. By the
superlinear convergence of $(x_n)$, Lemma 7.3 is applicable for infinitely
many n. Choose such an n (sufficiently large). There are only
three possibilities:


1. Equation (7.30) holds;

2. Equation (7.32) holds; or

3. Neither (7.30) nor (7.32) holds, so


2  x  x 
1  n  n+11 


(7-42) 


In the first case, Lemma 7.3 shows that we can replace n by n+2, and
continue with one of the three cases (it is crucial to note that Lemma 7.3
is still applicable). In the second case, Lemma 7.3 shows that we can
replace n by n+3 and continue. Since no infinite sequence N with
the above properties exists, the third case must eventually arise. Then,
from  (7-^2)  and  Lemma  7*3>  we  see  that  Lemma  7*2  is  applicable  to  the 
sequence  (x^)  ,  where  x^  =  •  By  Lemma  7*2,  (x^)  converges 

with strong order $\beta_2$ and asymptotic constant 1, and hence, so does
$(x_n)$. In view of the assumption (7.27), this completes the proof.


8.  Accelerating  convergence 

If  a  very  accurate  solution  is  required,  and  high-precision  evaluations 
of  f  are  expensive,  then  it  may  be  worthwhile  to  try  to  increase  the 
order  of  convergence  of  the  successive  approximations  by  some  acceleration 
technique.  For  example,  we  can  use  Lemma  6.1  to  improve  the  current 
approximation  at  each  step  of  the  iterative  process.  Jarratt  (1967)  suggests 
one  way  of  doing  this  if  q  =  2  ,  but  the  method  which  we  are  about  to 
describe  seems  easier  to  justify  (see  Theorem  8.1),  and  applies  for 
any  q  >  1  . 

Suppose that $x_0, \ldots, x_{q+1}$ are approximations to a simple zero
$\zeta$ of $f^{(q-1)}$. For example, they could be the last q+2
approximations generated by the successive interpolation process discussed
above. We may define $x_{q+2}, x_{q+3}, \ldots$ in the following way: if
$n \ge 1$ and $x_0, \ldots, x_{n+q}$ are already defined, let
$P_n = IP(f; x_n, \ldots, x_{n+q})$, and choose $y_n$ such that

    $P_n^{(q-1)}(y_n) = 0$ ,    (8.1)

i.e., $y_n$ is just the next approximation generated by our usual
interpolation process. From Lemma 3.1, $y_n$ is given explicitly by

    $y_n = \frac{1}{q}\left\{\sum_{i=1}^{q} x_{n+i} - \frac{f[x_{n+1}, \ldots, x_{n+q}]}{f[x_n, \ldots, x_{n+q}]}\right\}$ .    (8.2)

Instead of taking $y_n$ as the next approximation $x_{n+q+1}$, we use
Lemma 6.1 to compute a correction to $y_n$, and take the corrected value
as the next approximation. Formally, we define $x_{n+q+1}$ by

    $x_{n+q+1} = y_n - \frac{f[x_{n-1}, \ldots, x_{n+q}]}{q\,f[x_n, \ldots, x_{n+q}]}\,s_n$ ,    (8.3)

where

    $s_n = \sum_{0 \le i < j \le q} (x_{n+i} - y_n)(x_{n+j} - y_n)$ .    (8.4)

For a justification of equations (8.3) and (8.4), see the proof of Theorem
8.1 below. This theorem shows that, under suitable conditions, the
sequence $(x_n)$ is well-defined, and $x_n \to \zeta$ with weak order
appreciably greater than $\beta_q$, which is the usual order of convergence
of the unaccelerated process (see Sections 5 to 7). Note that there is
very little extra work involved in computing $x_{n+q+1}$ from equations
(8.3) and (8.4) if $y_n$ is computed via (8.2), for
$f[x_n, \ldots, x_{n+q}]$ and $f[x_{n-1}, \ldots, x_{n+q-1}]$ (except at the
first iteration) will already be known.

Before stating Theorem 8.1, we define some constants $\beta_q'$ which
take the place of the constants $\beta_q$ (see Definition 5.1) if the
accelerated process is used.
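A direct transcription of (8.2) to (8.4) may make the accelerated step easier to follow. This is a sketch under stated assumptions, not the implementation behind the tables: it takes (8.2) with the sum over i = 1, ..., q and the divided difference $f[x_{n+1}, \ldots, x_{n+q}]$, takes (8.3) with $f[x_{n-1}, \ldots, x_{n+q}]$ in the numerator, and uses Python's `decimal` module with an arbitrary 60-digit precision. The function names are illustrative, and the divided differences are recomputed recursively rather than reused between steps.

```python
from decimal import Decimal, getcontext
from itertools import combinations

getcontext().prec = 60  # arbitrary generous working precision


def divided_difference(f, pts):
    """Newton divided difference f[pts[0], ..., pts[-1]] for distinct points."""
    if len(pts) == 1:
        return f(pts[0])
    upper = divided_difference(f, pts[1:])
    lower = divided_difference(f, pts[:-1])
    return (upper - lower) / (pts[-1] - pts[0])


def plain_step(f, pts):
    """Equation (8.2): y_n from the q+1 points pts = [x_n, ..., x_{n+q}]."""
    q = len(pts) - 1
    quotient = divided_difference(f, pts[1:]) / divided_difference(f, pts)
    return (sum(pts[1:]) - quotient) / q


def accelerated_step(f, prev, pts):
    """Equations (8.3) and (8.4): prev is x_{n-1}, pts is [x_n, ..., x_{n+q}]."""
    q = len(pts) - 1
    y = plain_step(f, pts)
    s = sum((a - y) * (b - y) for a, b in combinations(pts, 2))       # (8.4)
    factor = divided_difference(f, [prev] + pts) / (q * divided_difference(f, pts))
    return y - factor * s                                             # (8.3)


def f(x):
    # q = 1 example of Section 9, with zeta = 0.
    return x + x**2 + x**3


xs = [Decimal(2), Decimal(1)]
xs.append(plain_step(f, xs[0:2]))            # x_2 from the usual process
for n in range(1, 5):                        # accelerated x_3, ..., x_6
    xs.append(accelerated_step(f, xs[n - 1], xs[n:n + 2]))

if __name__ == "__main__":
    for n, x in enumerate(xs):
        print(n, f"{x:.4e}")
```

With these starting values the first accelerated iterates come out near 0.2100 and 0.04389, in line with the accelerated column for q = 1 in Table 9.1.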


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Definition  8.1 


For $q \ge 1$, $\beta_q'$ is the positive real root of

    $x^{q+2} = x^2 + x + 1$ .    (8.5)


Remarks

It is easy to see that $\beta_q' > \beta_q$, and, corresponding to the bound
(5.2), we have

    $3^{1/(q+1)} < \beta_q' < 3^{1/q}$ .    (8.6)

If $x_n \to \zeta$ with weak order $\beta > 1$ then, by the definition of
order (see Section 2), for any $\varepsilon > 0$ we eventually have

    $-\log|x_n - \zeta| \ge (\beta - \varepsilon)^n$ .    (8.7)

Thus, the number of function evaluations required to reduce
$|x_n - \zeta|$ below a very small positive tolerance is inversely
proportional to $\log\beta$ (assuming that approximate equality holds in
(8.7)), and the ratio $\log\beta_q / \log\beta_q'$ suggests how much we gain
by using the accelerated process, rather than the unaccelerated process,
if very high accuracy is required. From the bounds (5.2) and (8.6),

    $\lim_{q\to\infty} \frac{\log\beta_q}{\log\beta_q'} = \log_3 2 = 0.6309\ldots$ ,    (8.8)

so there is a 37 percent saving for large q. Of course, the only
practical interest is in small values of q, and in Table 8.1 the
values of $\beta_q'$, $\beta_q$ and $\log\beta_q / \log\beta_q'$ are given for
q = 1, 2, ..., 10. The entries for $\beta_q'$ are correctly rounded to 12
decimal places, and the other entries are given to 4 places (they are
given for comparison only: see Table 5.1 for the $\beta_q$ to 12 places).
The table suggests that $\beta_3' = \beta_2$, and this is true, for
$x^5 - x^2 - x - 1 = (x^3 - x - 1)(x^2 + 1)$.

Table 8.1: The constants $\beta_q'$ for q = 1(1)10 to 12D

     q       beta'_q          beta_q    log beta_q / log beta'_q
     1    1.839286755214     1.6180          0.7897
     2    1.465571231877     1.3247          0.7357
     3    1.324717957245     1.2207          0.7093
     4    1.249851588864     1.1673          0.6936
     5    1.203216033518     1.1347          0.6832
     6    1.171321856385     1.1128          0.6757
     7    1.148113497353     1.0970          0.6702
     8    1.130459571864     1.0851          0.6658
     9    1.116575158368     1.0758          0.6623
    10    1.105367322949     1.0683          0.6595

See Definition 8.1, and the remarks above, for a description of the
constants $\beta_q'$ and the significance of the ratio
$\log\beta_q / \log\beta_q'$. The constants $\beta_q$ are given to 12D in
Table 5.1.
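The bounds described as "easy to see", and the limit (8.8), can be checked in a few lines. The derivation below is a sketch that assumes (5.2) with $\alpha = 1$ in the form $2^{1/(q+1)} < \beta_q < 2^{1/q}$ and (8.6) in the form $3^{1/(q+1)} < \beta_q' < 3^{1/q}$; it writes $\beta = \beta_q$ and $\beta' = \beta_q'$, and uses only $\beta > 1$, $\beta' > 1$ and the defining equations.

```latex
\begin{aligned}
\beta^{q+1} &= \beta + 1 > 2,
  &\qquad \beta^{q+1} &= \beta + 1 < 2\beta
  \;\Longrightarrow\; \beta^{q} < 2, \\
\beta'^{\,q+1} &= \beta' + 1 + \frac{1}{\beta'} > 3,
  &\qquad \beta'^{\,q+2} &= \beta'^{\,2} + \beta' + 1 < 3\beta'^{\,2}
  \;\Longrightarrow\; \beta'^{\,q} < 3, \\[4pt]
\frac{q}{q+1}\,\log_3 2 \;&<\; \frac{\log\beta_q}{\log\beta_q'}
  \;<\; \frac{q+1}{q}\,\log_3 2
  &&\longrightarrow\; \log_3 2 \approx 0.6309
  \quad (q \to \infty).
\end{aligned}
```

The inequality $\beta' + 1 + 1/\beta' > 3$ holds because $t + 1/t > 2$ for every positive $t \ne 1$.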


Theorem  8.1 

Suppose that $f \in LC^{q+1}[a,b;M]$; $\zeta \in (a,b)$; $f^{(q-1)}(\zeta) = 0$;
$f^{(q)}(\zeta) \ne 0$; and $x_0, \ldots, x_{q+1}$ are (distinct) points in
[a,b]. If $x_0, \ldots, x_{q+1}$ are sufficiently close to $\zeta$, then a
sequence $(x_n)$ is uniquely defined by equations (8.2) to (8.4), and
$x_n \to \zeta$ with weak order at least $\beta_q'$ (see Definition 8.1) as
$n \to \infty$.


Proof 


For $n \ge 1$, let $\delta_n$ be the largest of
$|x_n - \zeta|, \ldots, |x_{n+q} - \zeta|$; let $\delta_n'$ be the
second largest; and let

    $\bar\delta_n = \max(\delta_n, |x_{n-1} - \zeta|)$ .    (8.9)

If $y_n$ is defined by equation (8.2), then Lemma 6.1 shows that

    $y_n - \zeta = K \sum_{0 \le i < j \le q} (x_{n+i} - \zeta)(x_{n+j} - \zeta) + O(\delta_n^2 \delta_n')$    (8.10)

as $\delta_n \to 0$, where

    $K = \frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)}$ .    (8.11)

In particular, (8.10) implies that

    $y_n - \zeta = O(\delta_n \delta_n')$    (8.12)

as $\delta_n \to 0$. Thus, for $0 \le i < j \le q$,

    $(x_{n+i} - y_n)(x_{n+j} - y_n) = (x_{n+i} - \zeta)(x_{n+j} - \zeta) + O(\delta_n^2 \delta_n')$    (8.13)

as $\delta_n \to 0$.

If $\delta_n$ is sufficiently small then, since $f^{(q)}(\zeta) \ne 0$, we have
$f[x_n, \ldots, x_{n+q}] \ne 0$, and, by Theorem 2.5.1,

    $\frac{f[x_{n-1}, \ldots, x_{n+q}]}{q\,f[x_n, \ldots, x_{n+q}]} = K + O(\bar\delta_n)$    (8.14)

as $\bar\delta_n \to 0$.

If $s_n$ is as in (8.4), then (8.13) and (8.14) give

    $\frac{f[x_{n-1}, \ldots, x_{n+q}]}{q\,f[x_n, \ldots, x_{n+q}]}\,s_n = K \sum_{0 \le i < j \le q} (x_{n+i} - \zeta)(x_{n+j} - \zeta) + O(\bar\delta_n \delta_n \delta_n')$    (8.15)

as $\bar\delta_n \to 0$. Thus, from (8.3) and (8.10),

    $x_{n+q+1} - \zeta = O(\bar\delta_n \delta_n \delta_n')$    (8.16)

as $\bar\delta_n \to 0$. This shows that, provided $\bar\delta_1$ is
sufficiently small, the sequence $(x_n)$ is uniquely defined, lies in
[a,b], and $x_n \to \zeta$ as $n \to \infty$.

From equation (8.16), there is a positive constant A such that,
for all $n \ge 1$,

    $|x_{n+q+1} - \zeta| \le A^2\,\bar\delta_n \delta_n \delta_n'$    (8.17)


and, if $\bar\delta_1$ is sufficiently small, then

    $-\log(A|x_n - \zeta|) \ge (\beta_q')^n$    (8.18)

for n = 0, ..., q+1. From equation (8.17) and the definition of
$\beta_q'$, we see that (8.18) holds for all $n \ge 0$, by induction on n.
Thus

    $\liminf_{n\to\infty}\,(-\log|x_n - \zeta|)^{1/n} \ge \beta_q'$ ,    (8.19)

i.e., the weak order of convergence is at least $\beta_q'$, so the proof is
complete.

9. Some numerical examples

To illustrate the theoretical results obtained in Sections 4 to 8,
we give the following examples:

1. q = 1, $f(x) = x + x^2 + x^3$, $x_0 = 2$, $x_1 = 1$;

2. q = 2, $f(x) = 8 + 6x^2 + 4x^3 + 3x^4$, $x_0 = 2$, $x_1 = 1$,
   $x_2 = 0.5$;

3. q = 3, $f(x) = 1 + 40x + 10x^3 + 5x^4 + 3x^5$, $x_0 = 2$, $x_1 = 1$,
   $x_2 = 0.5$, $x_3 = 0.25$; and

4. q = 4, $f(x) = 1 + 2x + 40x^2 + 5x^4 + 2x^5 + x^6$, $x_0 = 2$,
   $x_1 = 1$, $x_2 = 0.5$, $x_3 = 0.25$, $x_4 = 0.125$.

In all these examples $\zeta = 0$, and the iterative process defined
by (1.1) converges, even though the initial values are not very close
to $\zeta$. Apart from constant factors, the polynomials are obtained by
differentiating the last one (for q = 4) 4-q times, so we are solving
the same problem in four different ways.

Table 9.1 gives the sequences $(x_n)$ produced by the successive
interpolation process, for the functions and starting values given above.
To illustrate the superlinear convergence, the entries are given until
|x  |  <  10  "  ,  although  such  high  precision  would  seldom  be  required  in 

practical problems. The table also gives the sequences $(x_n')$ produced
by the accelerated interpolation process described in Section 8, with
starting values $x_i' = x_i$ for i = 0, ..., q+1. As predicted by Theorem 8.1
and Table 8.1, the accelerated sequences converge appreciably faster than
the unaccelerated ones.

To verify relations (8.12) and (8.16), the table also gives

    $r_n = \frac{x_n}{x_{n-q}\,x_{n-q-1}}$    (9.1)

and

    $r_n' = \frac{x_n'}{x_{n-q}'\,x_{n-q-1}'\,x_{n-q-2}'}$    (9.2)

when they are defined. With a few exceptions near the beginning of some
of the sequences, both $(|x_n|)$ and $(|x_n'|)$ are monotonic decreasing, so
$r_n$ and $r_n'$ should be bounded. From Lemma 6.1, we expect that

    $\lim_{n\to\infty} r_n = \frac{f^{(q+1)}(\zeta)}{q(q+1)f^{(q)}(\zeta)}$ ,    (9.3)

and this is just $2/(q(q+1))$ for our examples. Similarly, from the
proof of Theorem 8.1, we expect that

    $\lim_{n\to\infty} r_n' = -\frac{f^{(q+2)}(\zeta)}{q(q+1)(q+2)f^{(q)}(\zeta)}$ ,    (9.4)

and this is just $-6/(q(q+1)(q+2))$ for our examples.

The results support these predictions.
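The first two examples are easy to rerun. The sketch below is an illustration, not the original program: it assumes the unaccelerated iterate is given by formula (8.2), uses recursive divided differences in 60-digit `decimal` arithmetic (an arbitrary precision, chosen to sidestep the rounding problems discussed in the next paragraph), and computes the ratios $r_n$ of (9.1). All function names are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # arbitrary generous working precision


def divided_difference(f, pts):
    """Newton divided difference f[pts[0], ..., pts[-1]] for distinct points."""
    if len(pts) == 1:
        return f(pts[0])
    upper = divided_difference(f, pts[1:])
    lower = divided_difference(f, pts[:-1])
    return (upper - lower) / (pts[-1] - pts[0])


def next_iterate(f, pts):
    """Formula (8.2) applied to pts = [x_n, ..., x_{n+q}]."""
    q = len(pts) - 1
    quotient = divided_difference(f, pts[1:]) / divided_difference(f, pts)
    return (sum(pts[1:]) - quotient) / q


def run(f, start, steps):
    """Successive interpolation with q = len(start) - 1, for `steps` new iterates."""
    xs = [Decimal(s) for s in start]
    q = len(xs) - 1
    for _ in range(steps):
        xs.append(next_iterate(f, xs[-(q + 1):]))
    return xs


def r_values(xs, q):
    """The ratios r_n = x_n / (x_{n-q} x_{n-q-1}) of (9.1), keyed by n."""
    return {n: xs[n] / (xs[n - q] * xs[n - q - 1]) for n in range(q + 1, len(xs))}


def f1(x):
    return x + x**2 + x**3                      # example 1, q = 1


def f2(x):
    return 8 + 6 * x**2 + 4 * x**3 + 3 * x**4   # example 2, q = 2


xs1 = run(f1, (2, 1), 10)          # x_0, ..., x_11
xs2 = run(f2, (2, 1, "0.5"), 13)   # x_0, ..., x_15
r1 = r_values(xs1, 1)
r2 = r_values(xs2, 2)

if __name__ == "__main__":
    for n in sorted(r2):
        print(n, f"{xs2[n]:.3e}", f"{r2[n]:.4f}")
```

For q = 1 the ratios approach 1, and for q = 2 they approach 1/3, which is the value $2/(q(q+1))$ expected from (9.3).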


Table 9.1 was computed on an IBM 360/91 computer, with 14 digit
truncated floating-point arithmetic to base 16. To minimize the effect
of rounding errors, we took advantage of the fact that n-th divided
differences of $1, x, x^2, \ldots, x^{n-1}$ vanish identically when
computing the
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divided  differences  in  equations  (8.2)  and  (8.3).  Without  this  device, 
it  is  not  possible  to  reduce  jx^j  or  jx^|  to  10 without  using 
higher  precision  arithmetic,  because  of  the  effect  of  rounding  errors 
(except  for  q  =  1)  . 

For q = 2, our example is the same as that used by Jarratt (1967),
and our results agree with his for n ≤ 9. For n = 10 and 11 our
results differ slightly, presumably because of rounding errors. The
example given by Jarratt (1968) for q = 3 has also been verified.

Table 9.1: Numerical results for q = 1, 2, 3 and 4

(An entry such as 7.273e-1 denotes 7.273 × 10^-1.)

q = 1

   n    x_n          x*_n          r_n       r*_n
   0    2.000        2.000
   1    1.000        1.000
   2    7.273e-1     7.273e-1      0.3636
   3    3.980e-1     2.100e-1      0.5473     0.1444
   4    1.983e-1     4.389e-2      0.6851     0.2874
   5    6.727e-2    -1.846e-3      0.8523    -0.2755
   6    1.276e-2     1.221e-5      0.9568    -0.7178
   7    8.543e-4     1.035e-9      0.9949    -1.0455
   8    1.090e-5     2.350e-17     0.9998    -1.0066
   9    9.314e-9    -2.982e-31     1.0000    -1.0039
  10    1.015e-13                  1.0000
  11    9.457e-22                  1.0000

q = 2

   n    x_n          x*_n          r_n       r*_n
   0    2.000        2.000
   1    1.000        1.000
   2    5.000e-1     5.000e-1
   3    5.162e-1     5.162e-1      0.2581
   4    2.681e-1     1.219e-1      0.5362     0.1219
   5    1.366e-1     3.271e-2      0.5291     0.1267
   6    6.978e-2     5.618e-3      0.5042     0.1786
   7    2.053e-2    -3.363e-4      0.5607    -0.1634
   8    4.547e-3    -3.484e-6      0.4772    -0.1556
   9    6.154e-4     1.325e-8      0.4296    -0.2144
  10    3.631e-5    -1.728e-12     0.3890    -0.2625
  11    9.956e-7    -3.844e-18     0.3558    -0.2477
  12    7.666e-9    -2.008e-26     0.3430    -0.2518
  13    1.215e-11                  0.3360
  14    2.548e-15                  0.3339
  15    3.104e-20                  0.3334
  16    1.032e-26                  0.3333

Table 9.1 (continued)

q = 3

   n    x_n          x*_n          r_n       r*_n
   0    2.000        2.000
   1    1.000        1.000
   2    5.000e-1     5.000e-1
   3    2.500e-1     2.500e-1
   4    3.775e-1     3.775e-1      0.1887
   5    1.814e-1     6.882e-2      0.3628     0.0688
   6    8.574e-2     1.567e-2      0.6860     0.1253
   7    4.214e-2     3.572e-3      0.4465     0.0757
   8    2.268e-2     7.222e-4      0.3313     0.1112
   9    5.580e-3    -3.949e-5      0.3588    -0.0970
  10    1.227e-3    -3.547e-7      0.3395    -0.0921
  11    2.347e-4    -2.893e-9      0.2455    -0.0716
  12    2.809e-5     8.630e-12     0.2219    -0.0847
  13    1.441e-6    -1.067e-15     0.2105    -0.1055
  14    5.518e-8     4.009e-21     0.1917    -0.0989
  15    1.164e-9                   0.1766
  16    7.021e-12                  0.1735
  17    1.354e-14                  0.1703
  18    1.077e-17                  0.1677
  19    1.365e-21                  0.1670

q = 4

   n    x_n          x*_n          r_n       r*_n
   0    2.000        2.000
   1    1.000        1.000
   2    5.000e-1     5.000e-1
   3    2.500e-1     2.500e-1
   4    1.250e-1     1.250e-1
   5    2.840e-1     2.840e-1      0.1420
   6    1.258e-1     3.887e-2      0.2517     0.0389
   7    5.453e-2     7.030e-3      0.4362     0.0562
   8    2.492e-2     1.461e-3      0.7075     0.0935
   9    1.274e-2     4.448e-4      0.3588     0.0501
  10    7.507e-3     1.168e-4      0.2101     0.0846
  11    1.564e-3    -4.334e-6      0.2279    -0.0558
  12    3.227e-4    -2.390e-8      0.2374    -0.0598
  13    6.871e-5    -2.370e-10     0.2164    -0.0519
  14    1.360e-5    -2.500e-12     0.1423    -0.0329
  15    1.545e-6     9.027e-15     0.1316    -0.0401
  16    6.659e-8    -6.291e-19     0.1316    -0.0520
  17    2.814e-9     1.243e-24     0.1270    -0.0506
  18    1.067e-10                  0.1142
  19    2.207e-12                  0.1050
  20    1.073e-14                  0.1046
  21    1.944e-17                  0.1040
  22    3.069e-20                  0.1022
  23    2.367e-23                  0.1005


10. Summary


The main results of this chapter for q = 1 (successive linear
interpolation for finding a zero) and q = 2 (successive parabolic
interpolation for finding a turning point) are summarized below.

Theorem 3.1

q = 1: If $f \in C$ and $x_n \to \zeta$, then $f(\zeta) = 0$.

q = 2: If $f \in C^1$ and $x_n \to \zeta$, then $f'(\zeta) = 0$.

Theorem 4.1

q = 1: If $f \in C^1$, $f'(\zeta) \ne 0$, and a good start, then superlinear convergence.

q = 2: If $f \in C^2$, $f''(\zeta) \ne 0$, and a good start, then superlinear convergence.

Theorem 5.1

q = 1: If $f \in LC^1$, $f'(\zeta) \ne 0$, and a good start, then weak order at
least $\beta_1 = 1.618\ldots$

q = 2: If $f \in LC^2$, $f''(\zeta) \ne 0$, and a good start, then weak order at
least $\beta_2 = 1.324\ldots$

Theorem 7.1

q = 1: If $f \in LC^2$, $f'(\zeta) \ne 0$, and a good start, then either strong
order $\beta_1 = 1.618\ldots$ or weak order at least 2.

q = 2: If $f \in LC^3$, $f''(\zeta) \ne 0$, and a good start, then either strong
order $\beta_2 = 1.324\ldots$ or weak order at least
$\left(\frac{3+\sqrt{5}}{2}\right)^{1/3} = 1.378\ldots$

Theorem 8.1

q = 1: If $f \in LC^2$, $f'(\zeta) \ne 0$, and a good start, then the accelerated
sequence converges with weak order at least $\beta'_1 = 1.839\ldots$

q = 2: If $f \in LC^3$, $f''(\zeta) \ne 0$, and a good start, then the accelerated
sequence converges with weak order at least $\beta'_2 = 1.465\ldots$


Chapter 4.


An Algorithm with Guaranteed Convergence for Finding a Zero of a Function



1.  Introduction 


Let f be a real-valued function, defined on the interval [a, b],
with $f(a)f(b) \le 0$. f need not be continuous on [a, b]: for
example, f might be a limited-precision approximation to some continuous
function (see Forsythe (1969)). We want to find an approximation
$\hat\zeta$ to a zero $\zeta$ of f, to within a given positive tolerance
$2\delta$, by evaluating f at a small number of points. Of course, there
may be no zero in [a, b] if f is discontinuous, so we shall be satisfied
if f takes both nonnegative and nonpositive values in
$[\hat\zeta - 2\delta, \hat\zeta + 2\delta] \cap [a, b]$.

Clearly, such a $\hat\zeta$ may always be found by bisection in about
$\log_2[(b-a)/\delta]$ steps, and this is the best that we can do for
arbitrary f. In this chapter we describe an algorithm which is never much
slower than bisection (see Section 3), but which has the advantage of
superlinear convergence to a simple zero of a continuously differentiable
function, if the effect of rounding errors is negligible. This means
that, in practice, convergence is often much faster than for bisection
(see Section 4).

There is no contradiction here: bisection is the optimal algorithm (in a
minimax sense) for the class of all functions which change sign on [a, b],
but it is not optimal for other classes of functions: e.g., $C^1$ functions
with simple zeros, or convex functions (see Gross and Johnson (1959),
Bellman and Dreyfus (1962), and Chernousko (1970)).
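The bisection step count quoted above can be checked with a short sketch. This is not one of the procedures of this chapter; the function name `bisect` and the stopping rule (stop when the bracketing interval has length at most $2\delta$) are assumptions chosen to match the tolerance convention of the text.

```python
import math


def bisect(f, a, b, delta):
    """Plain bisection on [a, b], assuming f(a) and f(b) differ in sign.

    Stops when the bracketing interval has length <= 2*delta and returns
    (midpoint, number of bisection steps).
    """
    fa = f(a)
    steps = 0
    while b - a > 2.0 * delta:
        mid = a + 0.5 * (b - a)
        fm = f(mid)
        steps += 1
        if (fm > 0) == (fa > 0):
            a, fa = mid, fm      # sign change lies in [mid, b]
        else:
            b = mid              # sign change lies in [a, mid]
    return a + 0.5 * (b - a), steps


root, steps = bisect(lambda x: math.cos(x) - x, 0.0, 1.0, 1e-6)
print(root, steps, math.log2((1.0 - 0.0) / 1e-6))
```

With this stopping rule the number of steps is within one of $\log_2[(b-a)/\delta]$, in line with the "about" in the text.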


Dekker's  algorithm 


The algorithm described here is similar to one, which we call Dekker's
algorithm for short, variants of which have been given by van Wijngaarden,
Zonneveld and Dijkstra (1963), Wilkinson (1967), Peters and Wilkinson (1969),
confusion if we omit subscripts. $b$ is the best approximation so far
to $\zeta$, $a$ is the previous value of $b$, and $\zeta$ must lie between
$b$ and $c$. (Initially $a = c$.)

If $f(b) = 0$ then we are finished. The ALGOL procedure given by
Dekker (1969) does not recognise this case, and can take a large number of
small steps if f vanishes on an interval, which may happen because of
underflow.  This  occurred  with  f(x)  =  x^  on  an  IBM  3^0  computer  . 

If $f(b) \ne 0$, let $m = (c-b)/2$. We prefer not to return with
$\hat\zeta = \frac{1}{2}(b+c)$ as soon as $|m| \le 2\delta$, for if
superlinear convergence has set in then b, the most recent approximation,
is probably a much better approximation to $\zeta$ than
$\frac{1}{2}(b+c)$ is. Instead, we return with $\hat\zeta = b$ if
$|m| \le \delta$ (so the error is no more than $\delta$ if, as is often
true, f is nearly linear between b and c), and otherwise interpolate or
extrapolate f linearly between a and b, giving a new point i. (See later
for inverse quadratic interpolation.) To avoid the possibility of overflow
or division by zero, we find i as b + p/q, and the division is not
performed if $2|p| \ge 3|mq|$, for then i is not needed anyway. The
reason why the simpler criterion $|p| \ge |mq|$ is not used is explained
later. Since $0 < |f(b)| \le |f(a)|$ (see later), we can safely compute
$s = f(b)/f(a)$, $p = \pm(a-b)s$, and $q = \mp(1-s)$.

Define

$$ b'' = \begin{cases} i & \text{if } i \text{ lies between } b \text{ and } b+m \text{ ("interpolation")}, \\ b + m & \text{otherwise ("bisection")}, \end{cases} $$

and

$$ b' = \begin{cases} b'' & \text{if } |b - b''| > \delta , \\ b + \delta\,\mathrm{sign}(m) & \text{otherwise (a "step of } \delta \text{")}. \end{cases} $$


Dekker's algorithm takes $b'$ as the next point at which f is
evaluated, forms a new set {a, b, c} from the old set {b, c, b'}, and
continues. Unfortunately, it is easy to construct a function f for which
steps of $\delta$ are taken every time, so about $(b-a)/\delta$ function
evaluations are required for convergence. For example, let

$$ f(x) = \begin{cases} 2^{x/\delta} & \text{for } a+\delta \le x \le b , \\ -\left(\dfrac{b-a-\delta}{\delta}\right) 2^{b/\delta} & \text{for } x = a , \\ \text{arbitrary} & \text{for } a < x < a+\delta . \end{cases} \qquad (2.1) $$

The first linear interpolation gives the point $b-\delta$, the next (an
extrapolation) gives $b-2\delta$, the next $b-3\delta$, and so on.
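The behaviour of example (2.1) is easy to reproduce. The sketch below is not Dekker's procedure: it applies plain successive linear interpolation to the function (2.1), starting from the end-points a and b. The names `make_f` and `linear_interpolation_points` are hypothetical, and the values a = 0, b = 1, $\delta$ = 1/8 are chosen only so that every quantity is exactly representable in binary floating point.

```python
def make_f(a, b, delta):
    """The function (2.1); it is left undefined on (a, a + delta)."""
    def f(x):
        if x == a:
            return -((b - a - delta) / delta) * 2.0 ** (b / delta)
        if x >= a + delta:
            return 2.0 ** (x / delta)
        raise ValueError("f is arbitrary on (a, a + delta)")
    return f


def linear_interpolation_points(a, b, delta):
    """Successive linear interpolation started from the points a and b."""
    f = make_f(a, b, delta)
    prev, cur = a, b
    points = []
    while True:
        new = cur - f(cur) * (cur - prev) / (f(cur) - f(prev))
        if new < a + delta:      # next point would fall where f is arbitrary
            break
        points.append(new)
        prev, cur = cur, new
    return points


points = linear_interpolation_points(0.0, 1.0, 0.125)
print(points)
```

Each new point is exactly $\delta$ to the left of the previous one, so the number of evaluations grows like $(b-a)/\delta$ rather than like its logarithm.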

Even if steps of $\delta$ are avoided, the asymptotic rate of convergence
of successive linear interpolation may be very slow if f has a zero of
sufficiently high multiplicity. (Note that none of the theorems of
Chapter 3, apart from Theorem 3.3.1, apply for a multiple zero.) Suppose
that $f \in C^n[a,b]$, $n > 1$, $\zeta \in (a,b)$,
$f(\zeta) = f'(\zeta) = \ldots = f^{(n-1)}(\zeta) = 0$, and
$f^{(n)}(\zeta) \ne 0$ (i.e., $\zeta$ is a root of multiplicity n > 1). If
$\epsilon > 0$, $(x_1 - \zeta)/(x_0 - \zeta) \in (\epsilon, 1-\epsilon)$,
and $x_0$ is sufficiently close to $\zeta$, then successive linear
interpolation gives a sequence $(x_m)$ which converges linearly to
$\zeta$. In fact, equation (3.2.1) holds with $\rho = 1$ and
$K = \beta_{n-1}^{-1}$, where the constants $\beta_q$ are defined in
Definition 3.5.1. The proof is simple: if

$$ y_m = \frac{x_m - \zeta}{x_{m-1} - \zeta} \qquad (2.2) $$

is the ratio of successive errors, then a Taylor series expansion of f
about $\zeta$ gives

$$ y_{m+1} = \frac{1 - y_m^{\,n-1}}{1 - y_m^{\,n}} + o(1) \qquad (2.3) $$

as $x_m \to \zeta$, provided $y_m$ remains bounded away from 1. Since the
iteration

$$ z_{m+1} = g(z_m) , \qquad (2.4) $$

where

$$ g(z) = \frac{1 - z^{n-1}}{1 - z^n} , \qquad (2.5) $$

has fixed point $z = \beta_{n-1}^{-1}$, and

$$ |g'(z)| < 1 \qquad (2.6) $$

for $z \in (0,1)$, the result follows from Ostrowski (1966), Theorem 22.1.
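The fixed point of the iteration (2.4) can be checked numerically. The sketch below rests on one assumption that is not stated in this section: that $\beta_q$ is the positive root of $x^{q+1} = x + 1$, which is consistent with the values $\beta_1 = 1.618\ldots$ and $\beta_2 = 1.324\ldots$ quoted in Section 3.10. The function names are hypothetical.

```python
def g(z, n):
    """The map (2.5) for a zero of multiplicity n."""
    return (1.0 - z ** (n - 1)) / (1.0 - z ** n)


def fixed_point(n, z=0.5, iterations=200):
    """Iterate z <- g(z) as in (2.4), starting inside (0, 1)."""
    for _ in range(iterations):
        z = g(z, n)
    return z


def beta(q, lo=1.0, hi=2.0):
    """Positive root of x**(q+1) = x + 1 by bisection (assumed definition)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid ** (q + 1) - mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for n in (2, 3):
    print(n, fixed_point(n), 1.0 / beta(n - 1))
```

For a double zero (n = 2) the limiting error ratio is $1/\beta_1 = 0.618\ldots$, so each step of successive linear interpolation gains well under one binary digit.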

An example for which convergence is sublinear (see Definition 3.2.2)
is

$$ f(x) = \begin{cases} 0 & \text{if } x = 0 , \\ x \exp(-x^{-2}) & \text{if } x \ne 0 , \end{cases} \qquad (2.7) $$

on an interval containing the origin. This is an extreme case, for f and
all its derivatives vanish at the origin. (As a function of a complex
variable, f has an essential singularity at the origin.) If

$$ 0 < x_1 < x_0 < \sqrt{2} , \qquad (2.8) $$

then $(x_n)$ is a positive, monotonic decreasing sequence, and, by Theorem
3.3.1, its limit must be 0. Thus, successive linear interpolation does
converge, but very slowly.
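A few steps of successive linear interpolation on (2.7) show how slow the convergence is. This is only an illustrative sketch: the starting values 1.0 and 0.9 are arbitrary choices satisfying (2.8), and the name `secant_sequence` is hypothetical.

```python
import math


def f(x):
    """The function (2.7)."""
    return 0.0 if x == 0.0 else x * math.exp(-1.0 / (x * x))


def secant_sequence(x0, x1, steps):
    """Successive linear interpolation started from x0 and x1."""
    xs = [x0, x1]
    for _ in range(steps):
        f0, f1 = f(xs[-2]), f(xs[-1])
        if f1 == f0:             # guard against division by zero
            break
        xs.append(xs[-1] - f1 * (xs[-1] - xs[-2]) / (f1 - f0))
    return xs


xs = secant_sequence(1.0, 0.9, 30)
for n in (2, 10, 20, 31):
    print(n, xs[n], xs[n] / xs[n - 1])
```

The iterates stay positive and decrease monotonically, but the ratio of successive iterates creeps towards 1, which is the signature of sublinear convergence.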

Some  of  the  examples  above  are  rather  artificial,  and  unless  an 
extended  exponent  range  is  used  (see  later)  we  may  be  saved  by  underflow, 
i.e., the algorithm may terminate with a "zero" as soon as underflow occurs.
Even so, it is clear that convergence may occasionally be very slow if
Dekker's algorithm is used.

Our main modification of Dekker's algorithm ensures that a bisection
is done at least once in every $2\log_2(|b-c|/\delta)$ consecutive steps.
The modification is this: let e be the value of p/q at the step before
the last one. If $|e| < \delta$ or $|p/q| \ge \frac{1}{2}|e|$ then we do a
bisection, otherwise we do either a bisection or an interpolation just as
in Dekker's algorithm. Thus, $|e|$ decreases by at least a factor of two
on every second step, and when $|e| < \delta$ a bisection must be done.
(After a bisection we take e = m for the next step.) This is why our
algorithm, unlike Dekker's, is never much slower than bisection.

A  simpler  idea  is  to  take  e  as  the  value  of  p/q  at  the  last  step, 
but  practical  tests  show  that  this  slows  down  convergence  for  well-behaved 
functions  by  causing  unnecessary  bisections.  With  the  better  choice  of  e  , 
our  experience  has  been  that  convergence  is  always  at  least  as  fast  as 
for  Dekker's  algorithm  (see  Section  4). 

Inverse  quadratic  interpolation 

If the three current points a, b and c are distinct, we can find
the point i by inverse quadratic interpolation, i.e., fitting x as a
quadratic in y, instead of by linear interpolation using just a and b.
Experiments show that, for well-behaved functions, this device saves about
0.5 function evaluations per zero on the average (see Section 4). Inverse
interpolation is used because with direct quadratic interpolation we have
to solve a quadratic equation for i, and there is the problem of which
root should be accepted. Cox (1970) gives another way of avoiding this
problem: fit y as a function of the form p(x)/q(x), where p and q
are polynomials and p has degree one. A third possibility is to use the
acceleration technique described in Section 3.8. (See also Ostrowski (1966),
Chapter 11.)

Care must be taken to avoid overflow or division by zero when computing
the new point i. Since b is the most recent approximation to the root
$\zeta$, and a is the previous value of b, we do a bisection if
$|f(b)| \ge |f(a)|$. Otherwise we have $|f(b)| < |f(a)| \le |f(c)|$, so a
safe way to find i is to compute $r_1 = f(a)/f(c)$, $r_2 = f(b)/f(c)$,
$r_3 = f(b)/f(a)$,
$p = \pm\, r_3\,((c-b)\,r_1(r_1 - r_2) - (b-a)(r_2 - 1))$, and
$q = \mp\,(r_1 - 1)(r_2 - 1)(r_3 - 1)$.
Then i = b + p/q, but as before we do not perform the division unless it
is safe to do so. (If a bisection is to be done then i is not needed
anyway.) When inverse quadratic interpolation is used it is natural to
accept the point i if it lies between b and c and up to three-quarters
of the way from b to c: consider the limiting case where the
interpolating parabola has a vertical tangent at c and f(b) = -f(c).
Thus, i will be rejected if $2|p| \ge 3\,|\tfrac{1}{2}(c-b)\,q|$, which
explains the criterion discussed above.
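The pieces described in this section (the ordering of a, b and c, the forced bisection based on e, the safe computation of p and q, and the three-quarters acceptance test) can be collected into a short sketch. The Python below is an illustrative transliteration modelled on the ALGOL procedure zero of Section 6, not a replacement for it: the loop layout, the use of Python floats, and the test functions are assumptions made for this example.

```python
import math


def zero(a, b, macheps, t, f):
    """Sketch of the zero-finder; assumes f(a) and f(b) differ in sign."""
    fa, fb = f(a), f(b)
    c, fc = a, fa
    d = e = b - a
    while True:
        if abs(fc) < abs(fb):
            # keep b as the point with the smaller |f|
            a, b, c = b, c, b
            fa, fb, fc = fb, fc, fb
        tol = 2.0 * macheps * abs(b) + t
        m = 0.5 * (c - b)
        if abs(m) <= tol or fb == 0.0:
            return b
        if abs(e) < tol or abs(fa) <= abs(fb):
            d = e = m                      # bisection is forced
        else:
            s = fb / fa
            if a == c:                     # linear interpolation
                p = 2.0 * m * s
                q = 1.0 - s
            else:                          # inverse quadratic interpolation
                q = fa / fc
                r = fb / fc
                p = s * (2.0 * m * q * (q - r) - (b - a) * (r - 1.0))
                q = (q - 1.0) * (r - 1.0) * (s - 1.0)
            if p > 0.0:
                q = -q
            else:
                p = -p
            s, e = e, d                    # s is now p/q from two steps back
            if 2.0 * p < 3.0 * m * q - abs(tol * q) and p < abs(0.5 * s * q):
                d = p / q                  # accept the interpolated point
            else:
                d = e = m                  # otherwise bisect
        a, fa = b, fb
        if abs(d) > tol:
            b += d
        else:
            b += tol if m > 0.0 else -tol  # a "step of delta"
        fb = f(b)
        if (fb > 0.0) == (fc > 0.0):
            c, fc = a, fa                  # keep the zero between b and c
            d = e = b - a


eps = 2.0 ** -52                           # relative precision of a Python float
root = zero(0.0, 1.0, eps, 1e-12, lambda x: math.cos(x) - x)
flat = zero(-1.0, 1.1, eps, 1e-9, lambda x: x ** 9)
print(root, flat)
```

The second call uses a zero of multiplicity nine, where interpolation is of little help and the forced bisections do most of the work.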

The  tolerance 

As in Peters and Wilkinson (1969), the tolerance ($2\delta$) is a
combination of a relative tolerance ($4\epsilon$) and an absolute
tolerance (2t). At each step we take

$$ \delta = 2\epsilon|b| + t , \qquad (2.9) $$

where b is the current best approximation to $\zeta$, $\epsilon$ = macheps
is the relative machine precision ($\beta^{1-\tau}$ for $\tau$-digit
truncated floating-point arithmetic with base $\beta$, and half this for
rounded arithmetic), and t is a positive absolute tolerance. Since
$\delta$ depends on b, which could lie anywhere in the given interval, we
should replace $\delta$ by its positive minimum over the interval in the
upper bound for the number of function evaluations required. In the ALGOL
procedures the variable tol is used for $\delta$.

The  effect  of  rounding  errors 

The ALGOL procedures given in Section 6 have been written so that
rounding errors in the computation of i, m etc. cannot prevent
convergence with the above choice of $\delta$. The number $2\epsilon$ in
(2.9) may be increased if a higher relative error is acceptable, but it
should not be decreased, for then rounding errors might prevent convergence.

The bound for $|\hat\zeta - \zeta|$ has to be increased slightly if we
take rounding errors into account. Suppose that, for floating-point
numbers x and y, the computed arithmetic operations satisfy

$$ fl(x \times y) = xy(1 + \epsilon_1) \qquad (2.10) $$

and

$$ fl(x \pm y) = x(1 + \epsilon_2) \pm y(1 + \epsilon_3) , \qquad (2.11) $$

where $|\epsilon_i| \le \epsilon$ for i = 1, 2, 3 (see Wilkinson (1963)).
Also suppose that $fl(|x|) = |x|$ exactly, for any floating-point number x.
The algorithm computes approximations

$$ \tilde m = fl(0.5 \times (c - b)) \qquad (2.12) $$

and

$$ \widetilde{tol} = fl(2 \times \epsilon \times |b| + t) \qquad (2.13) $$

to m and tol, where $\zeta$ lies between $b = \hat\zeta$ and c, and the
algorithm terminates only when

$$ |\tilde m| \le \widetilde{tol} \qquad (2.14) $$

(unless f(b) = 0, when $\hat\zeta = \zeta = b$). Our assumptions (2.10)
and (2.11) give

$$ |\tilde m| \ge \tfrac{1}{2}\,(|c - b| - \epsilon(|b| + |c|))(1 - \epsilon) , \qquad (2.15) $$

and, similarly,

$$ \widetilde{tol} \le (2\epsilon|b| + t)(1 + \epsilon)^3 , \qquad (2.16) $$

so (2.14) implies that

$$ |c - b| \le \left(\frac{2}{1 - \epsilon}\right)(2\epsilon|b| + t)(1 + \epsilon)^3 + \epsilon(|b| + |c|) . \qquad (2.17) $$

Since $|\hat\zeta - \zeta| \le |c - b|$ and $b = \hat\zeta$, this gives

$$ |\hat\zeta - \zeta| \le 6\epsilon|\hat\zeta| + 2t , \qquad (2.18) $$

neglecting terms of order $\epsilon t$ and $\epsilon^2|\hat\zeta|$.
Usually the error is less than half this bound (see above).
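The step from (2.17) to (2.18) can be spelled out as follows. This is only a first-order expansion, under the stated convention that terms of order $\epsilon t$ and $\epsilon^2|b|$ are dropped, and it uses $|c| \le |b| + |c - b|$ together with $|c - b| = O(\epsilon|b| + t)$.

$$ \frac{2}{1-\epsilon}\,(2\epsilon|b| + t)(1+\epsilon)^3 = 4\epsilon|b| + 2t + O(\epsilon t + \epsilon^2|b|) , $$

$$ \epsilon(|b| + |c|) = 2\epsilon|b| + O(\epsilon t + \epsilon^2|b|) , $$

$$ |c - b| \le (4\epsilon|b| + 2t) + 2\epsilon|b| + O(\epsilon t + \epsilon^2|b|) = 6\epsilon|b| + 2t + O(\epsilon t + \epsilon^2|b|) . $$

With $b = \hat\zeta$, the last line is the bound (2.18).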

Of course, it is the user's responsibility to consider the effect of
rounding errors in the computation of f. The ALGOL procedures only
guarantee to find a zero $\hat\zeta$ of the computed function f to an
accuracy given by (2.18), and $\hat\zeta$ may be nowhere near a root of the
mathematically defined function that the user is really interested in!

Extended exponent range

In some applications the range of f may be larger than is allowed
for standard floating-point numbers. For example, f(x) might be
det(A - xI), where A is a matrix whose eigenvalues are to be found.
In Section 6 we give an ALGOL procedure (zero2) which accepts f(x)
represented as a pair (y(x), z(x)), where $f(x) = y(x) \cdot 2^{z(x)}$
(y real, z integer). Thus, zero2 will accept functions in the same
representation as is assumed by Peters and Wilkinson (1969), although
zero2 does not require that $1/16 \le |y(x)| < 1$ or y(x) = 0, and could
be simplified slightly if this assumption were made.


3.  Convergence  properties 

If the initial interval is [a, b], assume that

$$ b - a > \delta_m , \qquad (3.1) $$

and let

$$ k = \lceil \log_2((b-a)/\delta_m) \rceil , \qquad (3.2) $$

where $\delta_m$ is the minimum over [a, b] of the tolerance

$$ \delta(x) = 2 \cdot \mathrm{macheps} \cdot |x| + t \qquad (3.3) $$

(see Section 2), and $\lceil x \rceil$ means the least integer $y \ge x$.
By assumption (3.1), k > 0. (If k = 0, procedure zero takes only two
function evaluations.)

First consider the bisection process, terminating when the
interval known to contain a zero has length $\le 2\delta_m$ (so the
endpoint minimizing $|f|$ is probably within $\delta_m$ of the zero, and
certainly within $2\delta_m$). It is easy to see that this process
terminates after exactly k+1 function evaluations unless, by good fortune,
f happens to vanish at one of the points of evaluation.

Now  consider  procedure  zero  or  zero2.  If  k  =  1  then  the  procedure 
terminates  after  2  function  evaluations,  one  at  each  end-point  of  the 
initial  interval,  just  like  bisection.  If  k  =  2  then  there  are  2 
initial  evaluations,  and  after  no  more  than  4  more  evaluations  a  bisection 
must  be  done,  for  the  reason  described  in  Section  2.  After  this  bisection, 
which  requires  one  more  function  evaluation,  the  procedure  must  terminate. 
Thus, at most 2 + 5 = 7 evaluations are required. Similarly, for
$k \ge 1$, the maximum number of function evaluations required is

$$ 2 + (5 + 7 + 9 + \ldots + (2k+1)) = (k+1)^2 - 2 . \qquad (3.4) $$

Since Dekker's algorithm may take up to $2^k$ function evaluations (see
Section 2), this justifies the remarks made in Section 1. Also, although
the upper bound (3.4) is attainable, it is clear that it is unlikely to
be attained except for very contrived examples, and in practical tests our
algorithm has never taken more than 3(k+1) function evaluations (see
Section 4). This justifies the claim that our algorithm is never much
slower than bisection.
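The quantities in (3.2) and (3.4) are easy to tabulate. The sketch below is illustrative only: the helper name `evaluation_bounds` is hypothetical, and the minimum of $\delta(x)$ over [a, b] is taken at the point of the interval closest to the origin, which follows from (3.3).

```python
import math


def evaluation_bounds(a, b, macheps, t):
    """Return (k, evaluations for bisection, upper bound (3.4))."""
    # delta(x) = 2*macheps*|x| + t is smallest where |x| is smallest on [a, b]
    xmin = 0.0 if a <= 0.0 <= b else min(abs(a), abs(b))
    delta_m = 2.0 * macheps * xmin + t
    k = math.ceil(math.log2((b - a) / delta_m))
    return k, k + 1, (k + 1) ** 2 - 2


k, n_bisect, n_max = evaluation_bounds(1.0, 2.0, 2.0 ** -52, 1e-6)
print(k, n_bisect, n_max, 3 * (k + 1))
```

For this interval the worst-case bound is a few hundred evaluations, whereas the observed ceiling of 3(k+1) reported in the text is 63.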

Superlinear convergence

Ignoring the effect of rounding errors and the tolerance $\delta$, we see,
as in Dekker (1969), that the algorithm will eventually stop doing bisections
when it is approaching a simple zero $\zeta$ of a function. Thus,
temporarily ignoring the improvement described in Section 2, the theorems
of Chapter 3 are applicable (with q = 1). In particular, convergence is
superlinear, in the sense that $\limsup_{n \to \infty} |x_n - \zeta|^{1/n} = 0$,
provided f is $C^1$ near the simple zero $\zeta$ (Theorem 3.4.1). If $f'$
is Lipschitz continuous near $\zeta$, then the weak order of convergence is
at least $\frac{1}{2}(1 + \sqrt{5}) = 1.618\ldots$ (Theorem 3.5.1). For a
summary of the other results of Chapter 3, see Section 3.10.

If $f'$ is Lipschitz continuous near the simple zero $\zeta$, then, even
with the inverse parabolic interpolation modification described in Section 2,
the weak order of convergence is still at least $\frac{1}{2}(1 + \sqrt{5})$.
The idea of the proof is that, by Lemma 2.5.1, the curvature at $\zeta$ of
the approximating parabolas is bounded, so the inequality (3.5.13) still
holds for some M (no longer the Lipschitz constant) and sufficiently small
$\delta_n$.

Thus, our procedure always converges in a reasonable number of
steps and, under the conditions mentioned above, convergence is superlinear
with order at least 1.618.... It is well-known that, since
$(1.618\ldots)^2 = 2.618\ldots > 2$, this compares favorably with Newton's
method if an evaluation of $f'$ is as expensive as an evaluation of f. In
practice,  convergence  for  well-behaved  functions  is  fast,  and  the  stopping 
criterion  is  usually  satisfied  in  a  few  steps  once  superlinear  convergence 
sets  in. 
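The comparison with Newton's method can be made explicit by counting cost in function evaluations. This is the standard efficiency argument, under the assumption stated above that one evaluation of $f'$ costs the same as one evaluation of f.

$$ \text{two interpolation steps (2 evaluations):}\quad |e_{n+2}| \approx C\,|e_n|^{(1.618\ldots)^2} = C\,|e_n|^{2.618\ldots} , $$

$$ \text{one Newton step (2 evaluations):}\quad |e_{n+1}| \approx C'\,|e_n|^{2} . $$

Per evaluation, the order is therefore $1.618\ldots$ for successive linear interpolation against $\sqrt{2} = 1.414\ldots$ for Newton's method.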

Summary 

The results of Sections 2 and 3 above may be summarized in the following
"theorem":

If a < b, $\epsilon$ = macheps > 0, t > 0, f is defined on [a, b],
$f(a)f(b) \le 0$, and arithmetic is exact, then the algorithm defined by
procedure zero (see Section 6) converges, and returns
$\hat\zeta \in [a,b]$ such that f changes sign in
$I_\delta = [\hat\zeta - 2\delta, \hat\zeta + 2\delta] \cap [a,b]$, where
$\delta = 2\epsilon|\hat\zeta| + t$, and the number n of times that f is
evaluated does not exceed $(k+1)^2 - 2$, where k is given by equation
(3.2). Also, if $f \in C^1[a,b]$ has a unique simple zero
$\zeta \in (a,b)$, then $|\hat\zeta - \zeta|^{1/n} \to 0$ as macheps
and t → 0. Finally, if arithmetic is approximate, but satisfies (2.10)
and  (2.11)  with  £  <  10  ^  ,  then  the  algorithm  still  converges,  and 
returns $\hat\zeta$ such that f changes sign in $I_{\delta'}$, where
$\delta' = 1.01(3\epsilon|\hat\zeta| + t)$. (The factor 1.01 takes care of
terms of order $\epsilon t$ and $\epsilon^2|\hat\zeta|$.)


4. Practical tests

The ALGOL procedures zero (for standard floating-point numbers) and
zero2 (for floating-point with an extended exponent range) have been
tested using ALGOL W (Wirth and Hoare (1966), Bauer, Becker and Graham (1968))
on an IBM 360/67 and a 360/91 with machine precision $16^{-13}$. The number

of  function  evaluations  for  convergence  has  never  been  greater  than  three 
times  the  number  required  for  bisection,  even  for  the  functions  mentioned 
in  Section  2,  and  for  the  functions  given  by  (2.1)  and  (2.7)  Dekker's 
algorithm  takes  more  than  10^  function  evaluations.  Zero2  has  been 
tested  extensively  with  eigenvalue  routines,  and  in  this  application  it 
usually  takes  the  same  or  one  less  function  evaluation  per  eigenvalue  than 
Dekker's  algorithm,  and  considerably  less  than  bisection. 

In  Table  4.1,  we  give  the  number  of  function  evaluations  required 
for  convergence  with  procedure  zero2  and  functions  x^  ,  x^  ,  f^(x)  , 
and  fg(x)  ,  where 
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Table  4.1:  The  number  of  function  evaluations  for  convergence  with 
procedure  zero2 


f(x) 

a 

b  |  t 

c  -  i 

function  evals . 

x9 

-1.0 

+1.1 

l«-9 

4.99»-io 

81 

!  ; 

x9 

-1.0 

+4.0 

1*  -20 

j 

4.92* -21 

189 

x1* 

-1.0 

+4.0 

1* -20  i 

4.8l*-21 

•  l 

195 

M*) 

-1.0 

+4.0 

1* -20  ; 

* 

0 

j  33  i 

f2(x)  .  -1001200 

0 

.  I 

1*  -20 

.  .  L 

l'-9 

79  J 

a 

*  6  = 

2.17* -4  and 

=  o  . 

For a definition of $f_1$, $f_2$ etc., and a discussion, see above.
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Table  4.2:  Comparison  of  Dekker's  procedure  with  procedure  zero 


1—  •  • 

j  k 

r~ 

. ]” 

\ 

"1 

1.05838256968867 

10 

10 

* 

i  2 

1.2399500536075^ 

10 

9 

i 

i  3 

1.56239614624727 

10 

10 

i 

!  4 
• 

1 

2.05025255169417 

10 

10 

5 

1 

2.72832495649769 

11 

10 

6 

i 

3.61410919225782 

11 

10 

!  7 

4. 71048321337 581 

10 

10 

8 

6.00000000000000 

9 

9 

i 

9 

7.44175272160161 

10 

9 

10 

8.97167724536908 

10 

10 

11 

10.5063081987721 

10 

10 

12 

n.  9497474683053 

10 

9 

13 

13.2029707184829 

10 

9 

14 

1 

14.1742635087655 

i 

10 

9 

15 

1 

14.7893764953339  ! 

9 

8 

i  ...  i  _ 


For  a  definition  of  ,  n^  and  ,  see  above.  The  have  a 
relative error of less than $5 \times 10^{-14}$.

For each eigenvalue, the tolerances for Dekker's procedure and for procedure
zero were the same. (The tolerance was adjusted by the eigenvalue program
to ensure that the computed eigenvalues had a relative error of less
than $5 \times 10^{-14}$.) Tests were run for several values of n, p, q
and r: the table gives a typical set of results for n = 15, p = 7,
q = 7/4, and r = 1/2. To obtain the same accuracy with bisection, at
least 40 function evaluations per eigenvalue would be required, so both our
procedure and Dekker's are at least four times as fast as bisection for
this application.

Some more experimental results are given in Chapter 5. (For an
illustration of the superlinear convergence, see the examples given in
Section 3.9.)


5.  Conclusion 

Our algorithm appears to be at least as fast as Dekker's on
well-behaved functions, and, unlike Dekker's, it is guaranteed to converge
in a reasonable number of steps for any function. The ALGOL procedures zero
and zero2 given in Section 6 have been written to avoid problems with
rounding errors or overflow, and floating-point underflow is not harmful
as long as the result is set to zero.

Before giving the ALGOL procedures zero and zero2, we briefly discuss
some possible extensions.

Cox's algorithm

A recent paper by Cox (1970) gives an algorithm which combines
bisection with interpolation, using both f and $f'$. This algorithm
may fail to converge in a reasonable number of steps in the same way
as Dekker's. A simple modification, exactly like the one that we have given
in Section 2 for Dekker's algorithm, will remedy this defect without
slowing the rate of convergence for well-behaved functions.

Parallel  algorithms 

In this chapter we have considered only serial algorithms. It is
well-known (see, for example, Traub (1964)) that all serial methods which
use only function evaluations and Lagrangian interpolation polynomials
have weak order less than 2, unless certain relations hold between the
derivatives of f at $\zeta$. (Winograd has recently shown that no serial
method, using only function evaluations, can have order greater than 2
for all analytic functions with simple zeros.) Thus, nothing much can be
gained by going beyond linear or quadratic interpolation. However,
Miranker (1969) has shown that, if a parallel computer is available, a
class of algorithms using Lagrangian interpolation polynomials gives
superlinear convergence with weak order greater than 2 under certain
conditions. Also, it is clearly possible to generalize the bisection
process to "(r+1)-section" with advantage if a parallel computer with r
independent processors is available. See, for example, Wilde (1964).

There does not appear to be any fundamental difficulty in combining
generalized bisection with one of Miranker's parallel algorithms so that
convergence in a reasonable number of steps is guaranteed for any function,
and superlinear convergence with order greater than 2 is likely for
well-behaved functions.


Searching  an  ordered  file 


A problem which is commonly solved by a binary search (i.e., bisection)
method is that of locating an element in a large ordered file. The problem
may be formalized in the following way. Let S be a (finite or infinite)
totally ordered set, and $\varphi : S \to R$ an order-preserving mapping from S
into the real numbers. Suppose that $T = \{t_0, t_1, \ldots, t_n\}$ is a
subset of S, with $t_0 < t_1 < \cdots < t_n$. Given
$c \in [\varphi(t_0), \varphi(t_n)]$, we may define a monotonic function f
on [0,n] by

    $f(x) = \varphi(t_i) - c$ ,  (5.1)

where $x \in [0,n]$ and $i = \lceil x - \tfrac{1}{2} \rceil$. Thus, finding an index i such
that $\varphi(t_i) = c$ is equivalent to finding a zero of f in [0,n], and
our zero-finding algorithm could be used instead of the usual bisection
algorithm. It might be worthwhile to modify our algorithm slightly, so
as to take the discrete nature of the problem into account. A related
application of our algorithm is in finding the median (or other percentiles)
of a list of numbers, but there are faster ways of doing this.
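
The formalization above can be sketched in Python. The sketch assumes that the list keys already holds the values $\varphi(t_0), \ldots, \varphi(t_n)$ in increasing order; the names make_f and find_index are hypothetical, and plain bisection on integer arguments stands in for the zero-finding algorithm.

```python
import math

def make_f(keys, c):
    """Step function of equation (5.1): f(x) = keys[i] - c with
    i = ceil(x - 1/2), for x in [0, n] where n = len(keys) - 1."""
    def f(x):
        i = math.ceil(x - 0.5)
        return keys[i] - c
    return f

def find_index(keys, c):
    """Return i with keys[i] == c, or None if c is not present.
    Assumes keys is strictly increasing and keys[0] <= c <= keys[-1]."""
    f = make_f(keys, c)
    lo, hi = 0, len(keys) - 1
    if f(lo) == 0:
        return lo
    if f(hi) == 0:
        return hi
    # Invariant from here on: f(lo) < 0 < f(hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        fm = f(mid)
        if fm == 0:
            return mid
        if fm < 0:
            lo = mid
        else:
            hi = mid
    return None
```

Because f is constant on each interval of length one, a zero-finder only needs to evaluate it at integer arguments, which is one way of taking the discrete nature of the problem into account.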


ALGOL  60  procedures 


The  ALGOL  procedures  zero  (for  standard  floating-point  numbers)  and 
zero2  (for  floating-point  with  an  extended  exponent  range)  are  given  below. 
For a description of the idea of the algorithm, see Section 2. Some
test cases and numerical results are described in Section 4.

Procedure zero


real procedure zero (a, b, macheps, t, f);
value a, b, macheps, t; real a, b, macheps, t;
real procedure f;
begin comment:
    Zero returns a zero x of the function f in the given interval [a,b],
    to within a tolerance 6·macheps·|x| + 2·t, where macheps is the relative
    machine precision and t is a positive tolerance. The procedure assumes
    that f(a) and f(b) have different signs;
  real c, d, e, fa, fb, fc, tol, m, p, q, r, s;
  fa := f(a); fb := f(b);
int: c := a; fc := fa; d := e := b-a;
ext: if abs(fc) < abs(fb) then
    begin a := b; b := c; c := a;
          fa := fb; fb := fc; fc := fa
    end;
  tol := 2 × macheps × abs(b) + t; m := 0.5 × (c-b);
  if abs(m) > tol ∧ fb ≠ 0 then
  begin comment: See if a bisection is forced;
    if abs(e) < tol ∨ abs(fa) < abs(fb) then d := e := m else
    begin s := fb/fa; if a = c then
      begin comment: Linear interpolation;
        p := 2 × m × s; q := 1-s
      end
      else
      begin comment: Inverse quadratic interpolation;
        q := fa/fc; r := fb/fc;
        p := s × (2 × m × q × (q-r) - (b-a) × (r-1));
        q := (q-1) × (r-1) × (s-1)
      end;
      if p > 0 then q := -q else p := -p;
      s := e; e := d;
      if 2 × p < 3 × m × q - abs(tol × q) ∧ p < abs(0.5 × s × q) then
        d := p/q else d := e := m
    end;
    a := b; fa := fb;
    b := b + (if abs(d) > tol then d else if m > 0 then
              tol else -tol);
    fb := f(b);
    go to if fb > 0 ≡ fc > 0 then int else ext
  end;
  zero := b
end zero;
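
For readers without an ALGOL 60 compiler, the following Python function is a line-by-line transliteration of procedure zero as printed above. It is a sketch, not a tuned implementation: the labels int and ext are replaced by a loop, the comparisons are copied from the listing as it appears here, and the choice of macheps in any call is an assumption of the caller.

```python
def zero(a, b, macheps, t, f):
    """Transliteration of the ALGOL procedure zero.
    Assumes f(a) and f(b) have different signs and t > 0."""
    fa, fb = f(a), f(b)
    c, fc = a, fa                      # label "int"
    d = e = b - a
    while True:
        if abs(fc) < abs(fb):          # label "ext": make b the best point
            a, b, c = b, c, b
            fa, fb, fc = fb, fc, fb
        tol = 2.0 * macheps * abs(b) + t
        m = 0.5 * (c - b)
        if abs(m) <= tol or fb == 0.0:
            return b
        # See if a bisection is forced.
        if abs(e) < tol or abs(fa) < abs(fb):
            d = e = m
        else:
            s = fb / fa
            if a == c:                 # linear interpolation
                p = 2.0 * m * s
                q = 1.0 - s
            else:                      # inverse quadratic interpolation
                q = fa / fc
                r = fb / fc
                p = s * (2.0 * m * q * (q - r) - (b - a) * (r - 1.0))
                q = (q - 1.0) * (r - 1.0) * (s - 1.0)
            if p > 0.0:
                q = -q
            else:
                p = -p
            s = e
            e = d
            if 2.0 * p < 3.0 * m * q - abs(tol * q) and p < abs(0.5 * s * q):
                d = p / q              # accept the interpolated step
            else:
                d = e = m              # fall back to bisection
        a, fa = b, fb
        if abs(d) > tol:
            b += d
        elif m > 0.0:
            b += tol
        else:
            b -= tol
        fb = f(b)
        if (fb > 0.0) == (fc > 0.0):   # "go to int": reset c to a
            c, fc = a, fa
            d = e = b - a
```

A typical call, assuming IEEE double precision, would be zero(2.0, 3.0, 2.0 ** -52, 1e-12, lambda x: x ** 3 - 2.0 * x - 5.0).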


Procedure zero2

real procedure zero2 (a, b, macheps, t, f);
value a, b, macheps, t; real a, b, macheps, t; procedure f;
begin comment:
    Zero2 finds a zero of the function f in the same way as procedure
    zero, except that the procedure f(x,y,z) returns y (real) and z (integer)
    so that f(x) = y·2^z. Thus underflow and overflow can be avoided with
    a very large function range;
  real procedure pwr2 (x, n); value x, n; real x; integer n;
  comment: The procedure is machine-dependent. It computes x·2^n for
    n ≤ 0, avoiding underflow in the intermediate results;
  pwr2 := if n > -200 then x × 2 ↑ n else
          if n > -400 then (x × 2 ↑ (-200)) × 2 ↑ (n+200) else
          if n > -600 then ((x × 2 ↑ (-200)) × 2 ↑ (-200)) × 2 ↑ (n+400) else 0;
  integer ea, eb, ec;
  real c, d, e, fa, fb, fc, tol, m, p, q, r, s;
  f(a, fa, ea); f(b, fb, eb);
int: c := a; fc := fa; ec := ea; d := e := b-a;
ext: if (ec ≤ eb ∧ pwr2(abs(fc), ec-eb) < abs(fb))
      ∨ (ec > eb ∧ pwr2(abs(fb), eb-ec) > abs(fc)) then
    begin a := b; fa := fb; ea := eb;
          b := c; fb := fc; eb := ec;
          c := a; fc := fa; ec := ea
    end;
  tol := 2 × macheps × abs(b) + t; m := 0.5 × (c-b);
  if abs(m) > tol ∧ fb ≠ 0 then
  begin
    if abs(e) < tol ∨
       (ea ≤ eb ∧ pwr2(abs(fa), ea-eb) < abs(fb)) ∨
       (ea > eb ∧ pwr2(abs(fb), eb-ea) > abs(fa)) then
      d := e := m else
    begin s := pwr2(fb, eb-ea)/fa; if a = c then
      begin p := 2 × m × s; q := 1-s end
      else
      begin q := pwr2(fa, ea-ec)/fc;
        r := pwr2(fb, eb-ec)/fc;
        p := s × (2 × m × q × (q-r) - (b-a) × (r-1));
        q := (q-1) × (r-1) × (s-1)
      end;
      if p > 0 then q := -q else p := -p; s := e; e := d;
      if 2 × p < 3 × m × q - abs(tol × q) ∧ p < abs(0.5 × s × q) then
        d := p/q else d := e := m
    end;
    a := b; fa := fb; ea := eb;
    b := b + (if abs(d) > tol then d else if m > 0 then
              tol else -tol);
    f(b, fb, eb);
    go to if fb > 0 ≡ fc > 0 then int else ext
  end;
  zero2 := b
end zero2;
Chapter 5

An Algorithm with Guaranteed Convergence for Finding a
Minimum of a Function of One Variable


1. Introduction


A  common  computational  problem  is  finding  an  approximation  to  the 
minimum  or  maximum  of  a  real-valued  function  f  in  some  interval  [ a, b  ]  . 
This  problem  may  arise  directly  or  indirectly.  For  example,  many  methods 
for  minimizing  functions  g(x)  of  several  variables  need  to  minimize 
functions  of  one  variable  of  the  form 

    $\gamma(\lambda) = g(x_0 + \lambda s)$ ,  (1.1)

where $x_0$ and s are fixed (a "one-dimensional search" from $x_0$ in
the direction s). In this chapter, we give an algorithm which finds
an  approximate  local  minimum  of  f  by  evaluating  f  at  a  small  number 
of  points.  There  is  a  clear  analogy  between  this  algorithm  and  the 
algorithm  described  in  Chapter  4  for  root-finding  (see  Diagram  4.1) . 

Unless  f  is  unimodal  (Section  3),  the  local  minimum  may  not  be  the  global 
minimum  of  f  in  [a,b]  ,  and  the  problem  of  finding  global  minima  is 
left  until  Chapter  6. 
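
As an illustration of equation (1.1), the following Python sketch builds the one-variable function from a function g of several variables, a point and a direction. The helper name restrict_to_line and the example function g are hypothetical and chosen only so that the result can be checked by hand.

```python
def restrict_to_line(g, x0, s):
    """Return gamma with gamma(lam) = g(x0 + lam * s), as in equation (1.1).
    x0 and s are sequences of equal length; g takes a list of floats."""
    def gamma(lam):
        return g([xi + lam * si for xi, si in zip(x0, s)])
    return gamma

# Hypothetical example: g has its minimum at (1, -2).
def g(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# Searching from the origin in the direction (1, -2) gives
# gamma(lam) = 5 * (lam - 1)**2, whose minimum is at lam = 1.
gamma = restrict_to_line(g, [0.0, 0.0], [1.0, -2.0])
```

Any one-dimensional minimizer can then be applied to gamma without knowing anything about g.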

The  algorithm  described  in  this  chapter  could  be  used  to  solve  the 
problem  (1.1),  but,  for  this  application,  it  may  be  more  economical  to 
use  special  algorithms  which  make  use  of  any  extra  information  which  is 
available (e.g., estimates of the second derivative of $\gamma$), and which do
not attempt to find the minimum very accurately. This is discussed in
Chapter 7. Thus, a more likely practical use for our algorithm is to find
accurate  minima  of  naturally  arising  functions  of  one  variable. 

In Section 2 we consider the effect of rounding errors on any
minimization algorithm based entirely on function evaluations. Unimodality
is defined in Section 3, and we also define "$\delta$-unimodality" in an attempt
to explain why methods like golden section search work even for functions
which are not quite unimodal (because of rounding errors in their
computation, for example). In Sections 4 and 5 we describe a minimization
algorithm analogous to the zero-finding algorithm of Chapter 4, and some
numerical results are given in Section 6. Finally, some possible extensions
are described in Section 7, and an ALGOL 60 procedure is given in
Section 8.

Reduction  to  a  zero-finding  problem 

If f is differentiable in [a,b], a necessary condition for f
to have a local minimum at an interior point $\mu \in (a,b)$ is

    $f'(\mu) = 0$ .  (1.2)

There is also the possibility that the minimum is at a or b: for
example, this is true if f' does not change sign on [a,b]. If we
are prepared to check for this possibility, one approach is to look for
zeros of f'. If f' has different signs at a and b, then the
algorithm of Chapter 4 might be used to approximate a point $\mu$ satisfying
(1.2).

Since f' vanishes at any stationary point of f, it is possible
that the point found is a maximum, or even an inflexion point, rather than
a minimum. Thus, it is necessary to check whether the point found is a
true minimum, and continue the search in some way if it is not.

If it is difficult or impossible to compute f' directly, we could
approximate f' numerically (e.g., by finite differences), and search
for a zero of f' as above. However, a method which does not need f'
seems more natural, and could be preferred for the following reasons:

1. It may be difficult to approximate f' accurately because of
rounding errors;

2. A method which does not need f' may be more efficient (see below);
and

3. Whether f' can be computed directly or not, a method which avoids
difficulty with maxima and inflexion points is clearly desirable.


Jarratt's  method 

Jarratt (1967) suggests a method, using successive parabolic
interpolation, which is a special case of the iteration analyzed in
Chapter 3. With arbitrary starting points Jarratt's method may diverge,
or converge to a maximum or inflexion point, but this need not be fatal if
the method is used in combination with a safe method such as golden section
search, in the same way as, in Chapter 4, we used a combination of
successive linear interpolation and bisection for finding a zero. Theorem
3.5.1 shows that, if f has a Lipschitz continuous second derivative which
is positive at an interior minimum $\mu$, then Jarratt's method gives
superlinear convergence to $\mu$ with weak order at least 1.3247...

(see  Definitions  3.2.1  and  ^.^.1),  provided  the  initial  approximation  is 
good  and  rounding  errors  are  negligible. 
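
A minimal Python sketch of successive parabolic interpolation may make the iteration concrete. It is not Jarratt's published procedure: the function names are hypothetical, there are no safeguards, and, as noted above, the iteration may diverge or converge to a maximum if the starting points are poor.

```python
import math

def parabola_vertex(x0, f0, x1, f1, x2, f2):
    """Abscissa of the stationary point of the parabola through
    (x0, f0), (x1, f1), (x2, f2); None if the points are collinear."""
    num = (x2 - x1) ** 2 * (f2 - f0) - (x2 - x0) ** 2 * (f2 - f1)
    den = (x2 - x1) * (f2 - f0) - (x2 - x0) * (f2 - f1)
    if den == 0.0:
        return None
    return x2 - 0.5 * num / den

def successive_parabolic(f, x0, x1, x2, steps):
    """Apply the parabolic step repeatedly, each time using the three
    most recent points. Returns the list of all points generated."""
    xs = [x0, x1, x2]
    fs = [f(x) for x in xs]
    for _ in range(steps):
        x_new = parabola_vertex(xs[-3], fs[-3], xs[-2], fs[-2], xs[-1], fs[-1])
        if x_new is None:
            break
        xs.append(x_new)
        fs.append(f(x_new))
    return xs

# Example: f(x) = x**2 + exp(-x) has its minimum where 2x = exp(-x),
# i.e. near x = 0.3517.
iterates = successive_parabolic(lambda x: x * x + math.exp(-x), 0.0, 0.5, 1.0, 2)
```

For an exactly quadratic f a single step lands on the minimum, which is why the method is fast near a minimum with positive second derivative.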

Let us compare Jarratt's method with one of the alternatives:
estimating f' by finite differences, and then using successive linear
interpolation to find a zero of f'. (This process may also diverge,
or converge to a maximum.) Suppose that $f''(\mu) > 0$ and $f^{(3)}(\mu) \ne 0$, to
avoid exceptional cases (see Sections 3.6, 3.7 and ^.2). Since at least
two function evaluations are needed to estimate f' at any point, and
$\sqrt{1.618\ldots} = 1.272\ldots < 1.324\ldots$, Jarratt's method has a slightly
higher order of convergence. (The comparison is similar to that between
Newton's method and successive linear interpolation if an evaluation of
f' is as expensive as an evaluation of f: see Golab (1966) or
Ostrowski (1966).)
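
The two constants in this comparison can be checked numerically. The sketch below assumes the standard facts that 1.618... is the golden ratio (the order of successive linear interpolation) and that 1.3247... is the real root of x^3 = x + 1; the helper name real_root_cubic is hypothetical.

```python
import math

# Order of successive linear interpolation per step: the golden ratio.
golden_ratio = (1.0 + math.sqrt(5.0)) / 2.0

# With two function evaluations per step, the order per evaluation is
# the square root of the order per step.
order_per_evaluation = math.sqrt(golden_ratio)

def real_root_cubic(iterations=60):
    """Real root of x**3 - x - 1 = 0 in [1, 2], found by bisection."""
    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid ** 3 - mid - 1.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

parabolic_order = real_root_cubic()
```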


2. Fundamental limitations because of rounding errors

Suppose that $f \in LC^2[a,b;M]$ has a minimum at $\mu \in (a,b)$. Since
$f'(\mu) = 0$, Lemma 2.3.1 gives, for $x \in [a,b]$,

    $f(x) = f_0 + \tfrac{1}{2} f_0''(x-\mu)^2 + \tfrac{m_x}{6}(x-\mu)^3$ ,  (2.1)

where $|m_x| \le M$, $f_0 = f(\mu)$, and $f_0'' = f''(\mu)$. Because of rounding
errors, the best that can be expected if single-precision floating-point
numbers are used is that the computed value fl(f(x)) of f(x) satisfies
the (nearly attainable) bound

    $fl(f(x)) = f(x)(1+\epsilon_x)$ ,  (2.2)

where

    $|\epsilon_x| \le \epsilon$ ,  (2.3)

and $\epsilon$ is the relative machine precision (see Section 4.2). The error
bound is unlikely to be as good as this unless f is a very simple
function, or is evaluated using double-precision, and then rounded or
truncated to single-precision.

Let $\delta$ be the largest number such that, according to equations
(2.2) and (2.3), it is possible that

    $fl(f(\mu + \delta)) \le f_0$ .  (2.4)

It is unreasonable to expect any minimization procedure, based on
single-precision evaluations of f, to return an approximation $\hat\mu$ to
$\mu$ with a guaranteed upper bound for $|\hat\mu - \mu|$ less than $\delta$. This is
so, regardless of whether the computed values of f are used directly,
as in Jarratt's method, or indirectly, as in the other method suggested
in Section 1. The reason is simply that the minimum of the computed
function fl(f(x)) may lie up to $\delta$ from the minimum $\mu$ of f(x):
see Diagram 2.1.


Diagram  2.1:  The  effect  of  rounding  errors 
If $f_0'' > 0$, equations (2.1) to (2.4) give

Thus, if $\mu \ne 0$ and the term is negligible, an upper bound
for the relative error could hardly be less than
and full single-precision accuracy in $\mu$ is unlikely unless
is of order $\epsilon$ or less, although $fl(f(\mu))$ may agree with $f(\mu)$
to full single-precision accuracy. (See also Pike, Hill, and James (1967).)
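
The size of $\delta$ can be illustrated numerically. The sketch below uses only a leading-order estimate obtained by dropping the cubic term of (2.1) and combining the result with (2.2) to (2.4), namely $\delta \approx (2 \epsilon |f_0| / f_0'')^{1/2}$; this is an assumption made for illustration and is not claimed to be the exact form of the bound in the text. The name delta_estimate and the test function are hypothetical, and IEEE double precision with $\epsilon = 2^{-53}$ is assumed.

```python
import math

def delta_estimate(f0, f2, eps):
    """Leading-order estimate sqrt(2 * eps * |f0| / f2) of the distance
    over which rounding errors can hide the minimum (cubic term ignored)."""
    return math.sqrt(2.0 * eps * abs(f0) / f2)

# Hypothetical test function with minimum f0 = 1 at mu = 1 and f'' = 2.
def f(x):
    return 1.0 + (x - 1.0) ** 2

eps = 2.0 ** -53                      # assumed relative machine precision
d = delta_estimate(1.0, 2.0, eps)     # about 1.05e-8

# Well inside the estimated distance the computed values cannot be told
# apart from the minimum value; well outside it they can.
indistinguishable = f(1.0 + 0.5 * d) == f(1.0)
distinguishable = f(1.0 + 10.0 * d) > f(1.0)
```

Note that d is of the order of the square root of the machine precision, not of the machine precision itself.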

If f' has a simple analytic representation, then it may be easy to
compute f' accurately. For example, perhaps

    $fl(f'(x)) = f'(x(1+\epsilon_x))(1+\epsilon'_x)$ ,  (2.6)

where $|\epsilon_x| \le \epsilon$ and $|\epsilon'_x| \le \epsilon$, so we can expect to find a zero of f'
with a relative error bounded by $\epsilon$ (see Lancaster (1966) and Ostrowski
(1967b)). If (2.6) holds it might be worthwhile to use the algorithm
described in Chapter 4 to search for a zero of f', or at least use it to
refine the approximation $\hat\mu$ given by a procedure using only evaluations
of f. However, this is not so if f' has to be approximated by
differences, for then (2.6) can not be expected to hold.

Even if f(x) is a unimodal function, the computed approximation
fl(f(x)) will not be unimodal, because of rounding errors. Note that
fl(f(x)) must be constant over small intervals of real numbers x which
have the same floating-point approximation fl(x). In the next section
we define "$\delta$-unimodality" to circumvent this difficulty.

From now on, we consider the problem of approximating the minimum
of the computed function, or, equivalently, we ignore rounding errors
in the computation of f. The user should bear in mind that the minimum
of the computed function may differ from the minimum that he is really
interested in by as much as $\delta$ (see equation (2.5) above). In particular,
there is no point in wasting function evaluations by finding the minimum
of the computed function to excessive accuracy, and our procedure localmin
(Section 8) should not be called with the parameter "eps" much less than


There  are  several  different  definitions  of  a  unimodal  function  in  the 
literature.  One  source  of  confusion  is  that  the  definition  may  depend  on 
whether the function is supposed to have a unique minimum or a unique
maximum (we always consider minima). Kowalik and Osborne (1968) say that
f is unimodal on [a,b] if f has only one (no more than one?) stationary
value on [a,b]. This definition has two disadvantages: first, it is
meaningless unless f is differentiable on [a,b], but we would like to
say that |x| is unimodal on [-1,1]. Second, functions which have
inflexion points with a horizontal tangent are prohibited, but we would
like to say that $f(x) = x^6 - 3x^4 + 3x^2$ is unimodal on [-2,2] (here
$f'(\pm 1) = f''(\pm 1) = 0$).
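
The second example can be verified directly. The following Python sketch evaluates the first two derivatives of $x^6 - 3x^4 + 3x^2$ at $\pm 1$ and checks on a grid that the function nevertheless decreases strictly to the left of 0 and increases strictly to the right; the grid spacing of 0.01 is an arbitrary choice.

```python
def f(x):
    return x ** 6 - 3 * x ** 4 + 3 * x ** 2      # equals (x*x - 1)**3 + 1

def df(x):
    return 6 * x ** 5 - 12 * x ** 3 + 6 * x      # equals 6x (x*x - 1)**2

def d2f(x):
    return 30 * x ** 4 - 36 * x ** 2 + 6

# Grid on [-2, 2]; index 200 corresponds to x = 0.
xs = [k / 100.0 for k in range(-200, 201)]
decreasing_left = all(f(xs[i]) > f(xs[i + 1]) for i in range(0, 200))
increasing_right = all(f(xs[i]) < f(xs[i + 1]) for i in range(200, 400))
```

So f has horizontal inflexion points at $\pm 1$ but a single minimum at 0.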

Wilde (1964) gives another definition: f is unimodal on [a,b] if,
for all $x_1, x_2 \in [a,b]$,

    $x_1 < x_2 \supset (x_2 < x^* \supset f(x_1) > f(x_2)) \wedge (x_1 > x^* \supset f(x_1) < f(x_2))$ ,  (3.1)

where $x^*$ is a point at which f attains its least value in [a,b].
(We have reversed some of Wilde's inequalities as he considers maxima
rather than minima.) Wilde's definition does not assume differentiability,
or even continuity, but to verify that a function f satisfies (3.1) we
need to know the point $x^*$ (and such a point must exist). Hence, we
prefer the following definition, which is nearly equivalent to Wilde's
(see Lemma 3.1), but avoids any reference to the point $x^*$. The
definition is not as complicated as it looks: it merely says that f can
not have a "hump" between any two points $x_0$ and $x_2$ in [a,b]. Two
possible configurations of the points $x_0$, $x_1$, $x_2$ and $x^*$ in (3.1) and
(3.2) are illustrated in Diagram 3.1.


Diagram 3.1: Unimodal functions

Definition 3.1

f is unimodal on [a,b] if, for all $x_0$, $x_1$ and $x_2 \in [a,b]$,

    $x_0 < x_1 \wedge x_1 < x_2 \supset (f(x_0) \le f(x_1) \supset f(x_1) < f(x_2)) \wedge (f(x_1) \ge f(x_2) \supset f(x_0) > f(x_1))$ .  (3.2)
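
Condition (3.2) can be tested by brute force on a finite set of sample points. The Python sketch below does this; the function name is hypothetical, and passing the test on a grid is of course only a necessary condition for unimodality on the whole interval.

```python
from itertools import combinations

def is_unimodal_on(points, f):
    """Check condition (3.2) for every triple x0 < x1 < x2 drawn from
    the given sample points."""
    pts = sorted(set(points))
    vals = [f(x) for x in pts]
    for i, j, k in combinations(range(len(pts)), 3):
        f0, f1, f2 = vals[i], vals[j], vals[k]
        if f0 <= f1 and not f1 < f2:
            return False            # rose (or stayed level), then failed to rise
        if f1 >= f2 and not f0 > f1:
            return False            # fell (or stayed level) without falling before
    return True

grid = [k / 20.0 for k in range(-20, 21)]      # 41 points on [-1, 1]

def two_minima(x):
    return (x * x - 0.25) ** 2                 # minima at -0.5 and 0.5, hump at 0

def jump(x):
    return 1.0 - x if x <= 0 else x            # discontinuous at 0
```

On this grid abs and the discontinuous function jump pass the test, while two_minima and any constant function fail it.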


Lemma 3.1

If a point $x^*$ at which f attains its minimum in [a,b] exists,
then Wilde's definition of unimodality and Definition 3.1 are equivalent.


Proof 

Suppose that f is unimodal according to Definition 3.1. If $x_1 < x_2$
and $x_2 < x^*$, take $x_0' = x_1$, $x_1' = x_2$, and $x_2' = x^*$. Since f attains
its least value at $x^*$,

    $f(x_1') \ge f(x^*) = f(x_2')$ ,  (3.3)

so equation (3.2) with primed variables gives

    $f(x_0') > f(x_1')$ ,  (3.4)

and thus

    $f(x_1) > f(x_2)$ .  (3.5)

Similarly, if $x_1 < x_2$ and $x_1 > x^*$, equation (3.2) gives

    $f(x_1) < f(x_2)$ .  (3.6)

Thus, from (3.5) and (3.6), equation (3.1) holds.

Conversely, suppose that (3.1) holds and $x_0 < x_1 < x_2$. If
$f(x_0) \le f(x_1)$ then there are three possibilities, depending on the
position of $x^*$:

1. $x_1 > x^*$. Thus, by (3.1),

    $f(x_1) < f(x_2)$ .  (3.7)

2. $x_1 = x^*$. Take $x_1' = \tfrac{1}{2}(x_1 + x_2)$, and $x_2' = x_2$. Since
$x^* < x_1' < x_2'$, equation (3.1) with primed variables gives

    $f(x_1') < f(x_2')$ ,  (3.8)

so

    $f(x_1) = f(x^*) \le f(x_1') < f(x_2') = f(x_2)$ .  (3.9)

3. $x_1 < x^*$. Take $x_1' = x_0$ and $x_2' = x_1$. Since $x_1' < x_2' < x^*$,
equation (3.1) gives $f(x_1') > f(x_2')$, contradicting the assumption that
$f(x_0) \le f(x_1)$. Hence case 3 is impossible, and, by (3.7) and (3.9), we
always have $f(x_1) < f(x_2)$. Similarly, if $f(x_1) \ge f(x_2)$ then
$f(x_0) > f(x_1)$, so equation (3.2) holds, and the proof is complete.

A simple corollary of Lemma 3.1 is that, if f is continuous, then
Wilde's definition of unimodality and ours are equivalent. For arbitrary
f the definitions are not equivalent. For example,

    $f(x) = \begin{cases} 1 - x & \text{if } x \le 0 , \\ x & \text{if } x > 0 \end{cases}$  (3.10)

is unimodal on [-1,1] by our definition, but not by Wilde's, for $x^*$
does not exist.

The  following  theorem  gives  a  simple  characterization  of  unimodality. 
There is no assumption that f is continuous. Since a strictly monotonic
function (e.g., $x^3$) may have stationary points, the theorem shows that
both our definition and Wilde's are essentially different from Kowalik
and Osborne's, even if f is continuously differentiable. (Although
this point is obvious, it is sometimes overlooked. See also Corollary 3.3.)

Theorem  3.1 

f is unimodal on [a,b] (according to Definition 3.1) iff, for some
(unique) $\mu \in [a,b]$, either f is strictly monotonic decreasing in $[a,\mu)$
and strictly monotonic increasing in $[\mu,b]$, or f is strictly monotonic
decreasing in $[a,\mu]$ and strictly monotonic increasing in $(\mu,b]$.

The theorem is a special case of Theorem 3.2 below, so the proof is
omitted. The following corollaries are immediate.

Corollary  3.1 

If f is unimodal on [a,b], then f attains its least value at
most once on [a,b]. (If f attains its least value, then it must
attain it at the point $\mu$ given by Theorem 3.1.)

Corollary 3.2

If f is unimodal and continuous on [a,b], then f attains its
least value exactly once on [a,b].

Corollary 3.3

If $f \in C^1[a,b]$ then f is unimodal iff, for some $\mu \in [a,b]$,
$f' < 0$ almost everywhere on $[a,\mu]$ and $f' > 0$ almost everywhere
on $[\mu,b]$. (Note that f' may vanish at a finite number of points.)


Fibonacci  and  golden  section  search 

If f is unimodal on [a,b], then the minimum of f (or, if
the minimum is not attained, the point $\mu$ given by Theorem 3.1) can be
located to any desired accuracy by the well-known methods of Fibonacci
search or golden section search. The reader is referred to Wilde (1964)
for an excellent description of these methods. (See also Boothroyd
(1965a, b), Johnson (1955), Krolak (1968), Newman (1965), Pike and Pixner
(1967), and Witzgall (1969).) Care should be taken to ensure that the
coordinates of the points at which f is evaluated are computed in a
numerically stable way (see Overholt (1965)). Fibonacci and golden section
search, as well as similar but less efficient methods, are based on the
following result, which shows how the interval known to contain $\mu$ may
be reduced in size.

Corollary 3.4

Suppose that f is unimodal on [a,b], $\mu$ is the point given by
Theorem 3.1, and $a \le x_1 < x_2 \le b$. If $f(x_1) \le f(x_2)$ then $\mu \le x_2$,
and if $f(x_1) \ge f(x_2)$ then $\mu \ge x_1$.

Proof

If $x_2 < \mu$ then, by Theorem 3.1, $f(x_1) > f(x_2)$. Thus, if
$f(x_1) \le f(x_2)$ then $\mu \le x_2$. The other half follows similarly.
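
Corollary 3.4 is all that a golden section search needs. The following Python sketch shows the basic loop; it is a simplified illustration with a hypothetical function name, and it makes no attempt at the numerically stable computation of the evaluation points recommended above.

```python
import math

def golden_section(f, a, b, tol):
    """Shrink [a, b] around the minimum of a unimodal f until its length
    is at most tol, then return the midpoint."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0       # 0.618...
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 <= f2:
            # Corollary 3.4: the minimum lies in [a, x2].
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = f(x1)
        else:
            # Corollary 3.4: the minimum lies in [x1, b].
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)
```

Only one new function evaluation is needed per step, and the interval shrinks by the factor 0.618... each time, which gives the guaranteed linear convergence referred to in Section 4.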

If  the  reader  is  prepared  to  ignore  the  problem  of  computing 
"unimodal"  functions  using  limited-precision  arithmetic,  he  may  skip  the 
rest  of  this  section. 

$\delta$-unimodality

As was pointed out at the end of Section 2, functions computed using
limited-precision arithmetic will not be unimodal because of rounding
errors. Thus, the theoretical basis for Fibonacci search, golden section
search, and similar methods, is irrelevant, and it is not clear that these
methods will give even approximately correct results in the presence of
rounding errors. To analyze this problem, we generalize the idea of
unimodality to $\delta$-unimodality. Intuitively, $\delta$ is a nonnegative number
such that Fibonacci or golden section search will give correct results,
even though f is not necessarily unimodal (unless $\delta = 0$), provided
that the distance between points at which f is evaluated is always
greater than $\delta$. The results of Section 2 indicate how large $\delta$ is
likely to be in practice. (Our aim differs from that of Richman (1968) in
defining the $\epsilon$-calculus, for he is interested in properties that hold as
$\epsilon \to 0$.) For another approach to the problem of rounding errors, see
Overholt (1967).

In the remainder of this section, $\delta$ is a fixed nonnegative number.
As well as $\delta$-unimodality, we need to define $\delta$-monotonicity. If $\delta = 0$
then $\delta$-unimodality and $\delta$-monotonicity reduce to unimodality (Definition 3.1)
and monotonicity.

Definition 3.2

Let I be an interval and f a real-valued function on I. We
say that f is strictly $\delta$-monotonic increasing on I if, for all
$x_1, x_2 \in I$,

    $x_1 + \delta < x_2 \supset f(x_1) < f(x_2)$ .  (3.11)

As an abbreviation, we shall write simply "f is $\delta$-$\uparrow$ on I".
Strictly $\delta$-monotonic decreasing functions (abbreviated $\delta$-$\downarrow$) are defined
in the obvious way.

Definition 3.3

Let I be an interval and f a real-valued function on I. We
say that f is $\delta$-unimodal on I if, for all $x_0, x_1, x_2 \in I$,

    $x_0 + \delta < x_1 \wedge x_1 + \delta < x_2 \supset (f(x_0) \le f(x_1) \supset f(x_1) < f(x_2)) \wedge (f(x_1) \ge f(x_2) \supset f(x_0) > f(x_1))$ .  (3.12)

The following theorem gives a characterization of $\delta$-unimodal functions.
It reduces to Theorem 3.1 if $\delta = 0$.

Theorem 3.2

f is $\delta$-unimodal on [a,b] iff there exists $\mu \in [a,b]$ such that
either f is $\delta$-$\downarrow$ on $[a,\mu)$ and $\delta$-$\uparrow$ on $[\mu,b]$, or f is $\delta$-$\downarrow$
on $[a,\mu]$ and $\delta$-$\uparrow$ on $(\mu,b]$. Furthermore, if f is $\delta$-unimodal on
[a,b], then there is a unique interval $[\mu_1,\mu_2] \subseteq [a,b]$ such that
the points $\mu$ with the above properties are precisely the elements of
$[\mu_1,\mu_2]$, and $\mu_2 \le \mu_1 + \delta$.

Proof

Suppose $\mu$ exists so that f is $\delta$-$\downarrow$ on $[a,\mu)$ and $\delta$-$\uparrow$ on $[\mu,b]$.
Take any $x_0$, $x_1$, $x_2$ in [a,b] with $x_0 + \delta < x_1$ and $x_1 + \delta < x_2$. If
$f(x_0) \le f(x_1)$ then, since f is $\delta$-$\downarrow$ on $[a,\mu)$, $\mu \le x_1$. As f is
$\delta$-$\uparrow$ on $[\mu,b]$, it follows that $f(x_1) < f(x_2)$. The other cases are
similar, so f is $\delta$-unimodal.

Conversely, suppose that f is $\delta$-unimodal on [a,b]. Let

    $\mu_1 = \inf\{x \in [a,b] \mid f \text{ is } \delta\text{-}\uparrow \text{ on } [x,b]\}$ ,  (3.13)

(so $\mu_1 \le \max(a, b-\delta)$), and

    $\mu_2 = \sup\{x \in [a,b] \mid f \text{ is } \delta\text{-}\downarrow \text{ on } [a,x]\}$ ,  (3.14)

(so $\mu_2 \ge \min(a+\delta, b)$).

It is immediate from the definitions (3.13) and (3.14) that f is
$\delta$-$\uparrow$ on $(\mu_1,b]$ and f is $\delta$-$\downarrow$ on $[a,\mu_2)$. We shall show that

    $\mu_1 \le \mu_2$ .  (3.15)

Suppose, by way of contradiction, that

    $\mu_1 > \mu_2$ .  (3.16)

This implies that $\mu_1 > a$ and $\mu_2 < b$, so, from the definitions of $\mu_1$
and $\mu_2$, there are points x' and x'' with

    $\mu_2 < x'' < \tfrac{1}{2}(\mu_1 + \mu_2) < x' < \mu_1$  (3.17)

such that f is not $\delta$-$\uparrow$ on [x',b] and f is not $\delta$-$\downarrow$ on [a,x''].
Thus, there are points y', y'', z', z'' in [a,b] such that

    $z'' + \delta < y'' \le x'' < x' \le y' < z' - \delta$ ,  (3.18)

    $f(z'') \le f(y'')$ ,  (3.19)

and

    $f(y') \ge f(z')$ .  (3.20)

Let $x_0 = z''$, $x_2 = z'$, and

    $x_1 = \begin{cases} y' & \text{if } f(y') > f(y'') , \\ y'' & \text{otherwise.} \end{cases}$  (3.21)

From relations (3.18) to (3.21), the points $x_0$, $x_1$ and $x_2$ contradict
$\delta$-unimodality (equation (3.12)). Thus (3.16) is impossible, (3.15) must
hold, and $[\mu_1,\mu_2]$ is nonempty.

Choose any $\mu$ in $[\mu_1,\mu_2]$. From the definitions of $\mu_1$ and $\mu_2$,
f is $\delta$-$\downarrow$ on $[a,\mu)$ and $\delta$-$\uparrow$ on $(\mu,b]$. Suppose, by way of contradiction,
that f is neither $\delta$-$\downarrow$ on $[a,\mu]$ nor $\delta$-$\uparrow$ on $[\mu,b]$. Then there
are points $y_1$ and $y_2$ in [a,b] such that

    $y_2 + \delta < \mu < y_1 - \delta$ ,  (3.22)

    $f(y_1) \le f(\mu)$ ,  (3.23)

and

    $f(y_2) \le f(\mu)$ .  (3.24)

Thus, the points $y_2$, $\mu$, and $y_1$ contradict the $\delta$-unimodality of f,
so f is either $\delta$-$\downarrow$ on $[a,\mu]$ or $\delta$-$\uparrow$ on $[\mu,b]$. This completes
the proof of the first part of the theorem.

Finally, by the definitions (3.13) and (3.14), the set of points $\mu$
satisfying the conditions of the theorem is precisely $[\mu_1,\mu_2]$. Since
f is both $\delta$-$\uparrow$ and $\delta$-$\downarrow$ on $(\mu_1,\mu_2)$, we have $\mu_2 \le \mu_1 + \delta$, and the
proof is complete.

Remarks

The interval $[\mu_1,\mu_2]$ depends on $\delta$. Suppose that f attains its
minimum in [a,b] at $\mu$. By Theorem 3.2, f is $\delta$-$\uparrow$ on $(\mu_1,b]$
and $\delta$-$\downarrow$ on $[a,\mu_2)$, so $\mu \in [\mu_2-\delta, \mu_1+\delta]$, an interval of length at
most $2\delta$.

As an example, consider

    $f(x) = x^2 + \epsilon\, g(x)$  (3.25)

on [-1,1], where g is any function (not necessarily continuous) with
$|g(x)| \le 1$, and $\epsilon \ge 0$. Since f(x) is bounded above and below by the
unimodal functions $x^2 + \epsilon$ and $x^2 - \epsilon$, we see that f is $\delta$-unimodal if
$\delta \ge \sqrt{2\epsilon}$. In a practical case $\epsilon$ might be (a small multiple of) the
relative machine precision, and the fact that the least $\delta$ for which f
is $\delta$-unimodal is of order $\epsilon^{1/2}$, rather than $\epsilon$, is to be expected from
the discussion in Section 2.
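
The example (3.25) can be explored with a brute-force test of condition (3.12) on a grid. In the Python sketch below the checker name, the grid, the value $\epsilon = 0.01$ and the particular oscillating g are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations

def is_delta_unimodal_on(points, f, delta):
    """Check condition (3.12) for every triple of sample points with
    x0 + delta < x1 and x1 + delta < x2."""
    pts = sorted(set(points))
    vals = [f(x) for x in pts]
    for i, j, k in combinations(range(len(pts)), 3):
        if not (pts[i] + delta < pts[j] and pts[j] + delta < pts[k]):
            continue
        f0, f1, f2 = vals[i], vals[j], vals[k]
        if f0 <= f1 and not f1 < f2:
            return False
        if f1 >= f2 and not f0 > f1:
            return False
    return True

eps = 0.01

def f(x):
    # x**2 plus a bounded perturbation eps * g(x), with |g(x)| <= 1;
    # g alternates between +1 and -1 on the grid below.
    return x * x + eps * math.cos(50.0 * math.pi * x)

grid = [k / 50.0 for k in range(-50, 51)]        # 101 points on [-1, 1]
unimodal = is_delta_unimodal_on(grid, f, 0.0)
delta_unimodal = is_delta_unimodal_on(grid, f, math.sqrt(2.0 * eps))
```

Here the perturbed function fails the test with $\delta = 0$ but passes it with $\delta = \sqrt{2\epsilon}$, as the remark above predicts.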
two function evaluations giving $I_j$ were at points separated by more
than $\delta_0$. The smallest such interval $I_j$ has length no greater than
$(2 + \sqrt{5})\delta_0$, so

    $|\hat\mu - \mu| \le (3 + \sqrt{5})\delta_0 \simeq 5.236\,\delta_0$ .  (3.26)

Thus, golden section search gives an approximation $\hat\mu$ which is nearly
as good as could be expected if we knew $\delta_0$. This may be regarded as
a justification for using golden section (or Fibonacci) search to approximate
minima of functions which, because of rounding errors, are only "approximately"
unimodal.


4. An algorithm analogous to Dekker's algorithm

For  finding  a  zero  of  a  function  f  ,  the  bisection  process  has  the 
advantage  that  linear  convergence  is  guaranteed,  as  the  interval  known  to 
contain  a  zero  is  halved  at  each  evaluation  of  f  after  the  first. 

However,  if  f  is  sufficiently  smooth  and  we  have  a  good  initial 
approximation  to  a  simple  zero,  then  a  process  with  superlinear  convergence 
will be much faster than bisection. This is the motivation for the
algorithm,  described  in  Chapter  4,  which  combines  bisection  and  successive 
linear  interpolation  in  a  way  which  retains  the  advantages  of  both. 

There  is  a  clear  analogy  between  methods  for  finding  a  minimum  and 
for  finding  a  zero.  The  Fibonacci  and  golden  section  search  methods  have 
guaranteed  linear  convergence,  and  correspond  to  bisection.  Processes 
like successive parabolic interpolation, which do not always converge, but
under certain conditions converge superlinearly, correspond to successive
linear interpolation. In this section we describe an algorithm which
combines golden section search and successive parabolic interpolation
in a way which retains the advantages of both. The analogy with the
algorithm of Chapter 4 is illustrated in Diagram 4.1.


                           Zeros                      Extrema

Linear convergence         Bisection                  Golden section search

Superlinear convergence    Successive linear   <-->   Successive parabolic
                           interpolation              interpolation


Diagram 4.1: The analogy between algorithms for
finding zeros and extrema


Many more or less "ad hoc" algorithms have been proposed for
one-dimensional minimization, particularly as components of n-dimensional
minimization algorithms. See Box, Davies and Swann (1969), Flanagan,
Vitale and Mendelsohn (1969), Fletcher and Reeves (1964), Jacoby,
Kowalik and Pizzo (1971), Kowalik and Osborne (1968), Pierre (1969),
Powell (1964), etc. The algorithm presented here might be regarded as
an unwarranted addition to this list, but it seems to us to be more
natural than these algorithms, which involve arbitrary prescriptions like
"if ... fails then halve the step-size and try again". Of course, our
algorithm is not quite free of arbitrary prescriptions either, so a more
objective criticism of the "ad hoc" algorithms is that for many of them
convergence to a local minimum in a reasonable number of function evaluations
can not be guaranteed, and, for the exceptions, the asymptotic rate of
convergence if f is sufficiently smooth is less than for our algorithm
(see Section 5). Note that we do not claim that our algorithm is
suitable for use in an n-dimensional minimization procedure: an "ad hoc"
algorithm may be more efficient (see Sections 1 and 7.1).

A  description  of  the  algorithm 

Here  we  give  an  outline  which  should  make  the  main  ideas  of  the 
algorithm  clear.  For  questions  of  detail  the  reader  should  refer  to 
Section  8,  where  the  algorithm  is  described  formally  by  the  ALGOL  60 
procedure  localmin. 

The  algorithm  finds  an  approximation  to  the  minimum  of  a  function  f 
defined  on  the  interval  [a,b]  .  Unless  a  is  very  close  to  b  ,  f  is 
never  evaluated  at  the  endpoints  a  and  b  ,  so  f  need  only  be  defined 
on  (a,b)  ,  and  if  the  minimum  is  actually  at  a  or  b  then  an  interior 
point distant no more than $2\,\mathrm{tol}$ from a or b will be returned,
where tol is a tolerance (see equation (4.2) below). The minimum found
may be local, but non-global, unless f is $\delta$-unimodal for some $\delta < \mathrm{tol}$.

At  a  typical  step  there  are  six  significant  points  a,b,u,v,w, 
and  x  ,  not  all  distinct.  The  positions  of  these  points  change  during 
the  algorithm,  but  there  should  be  no  confusion  if  we  omit  subscripts. 
Initially, (a,b) is the interval on which f is defined, and

$$v = w = x = a + \frac{3-\sqrt{5}}{2}\,(b-a). \tag{4.1}$$

(The magic number $\frac{3-\sqrt{5}}{2} = 0.381966\ldots$ is rather arbitrarily
chosen so that the first step is the same as for a golden section search.)

At  the  start  of  a  cycle  (label  "loop"  of  procedure  localmin)  the 
points a, b, u, v, w, and x always serve as follows: a local
minimum lies in [a,b]; of all the points at which f has been evaluated,
x is the one with the least value of f, or the point of the most recent
evaluation if there is a tie; w is the point with the next lowest value
of f; v is the previous value of w; and u is the last point at
which f has been evaluated (undefined the first time). One possible
configuration  is  shown  in  Diagram  4.2. 


Diagram  4.2:  A  possible  configuration 


As in procedure zero (Chapter 4), the tolerance is a combination of
a relative and an absolute tolerance. If

$$\mathrm{tol} = \mathrm{eps}\,|x| + t, \tag{4.2}$$

then the point x returned approximates a minimum to an accuracy of
$2\,\mathrm{tol} + \delta < 3\,\mathrm{tol}$, if f is $\delta$-unimodal near x
and $\delta < \mathrm{tol}$. The user must provide the positive parameters eps
and t. In view of the discussion in Section 2, it is generally unreasonable
to take eps much less than $\varepsilon^{1/2}$, where $\varepsilon$ is the
machine-precision (see Section 4.2). t should be positive in case the
minimum is at 0. It is possible that the error may exceed
$2\,\mathrm{tol} + \delta$ because of the effect of rounding errors
in determining if the stopping criterion is satisfied, but the additional
error is of order $\varepsilon|x|$, which is negligible if tol is of order
$\varepsilon^{1/2}|x|$ or greater.

Let $m = \frac{1}{2}(a+b)$ be the midpoint of the interval known to contain
the minimum. If $|x-m| \le 2\,\mathrm{tol} - \frac{1}{2}(b-a)$, i.e., if
$\max(x-a,\ b-x) \le 2\,\mathrm{tol}$, then the procedure terminates with x
as the approximate position of the minimum. Otherwise, numbers p and q
($q \ge 0$) are computed so that x + p/q is the turning point of the
parabola passing through (v, f(v)), (w, f(w)), and (x, f(x)). If two or
more of these points coincide, or if the parabola degenerates to a
straight line, then q = 0.

p and q are given by

$$p = \pm\left[(x-v)^2\,(f(x)-f(w)) - (x-w)^2\,(f(x)-f(v))\right] \tag{4.3}$$

$$\phantom{p} = \pm(x-v)(x-w)(w-v)\left\{(x-w)\,f[v,w,x] + f[w,x]\right\}, \tag{4.4}$$

and

$$q = \mp 2\left[(x-v)\,(f(x)-f(w)) - (x-w)\,(f(x)-f(v))\right] \tag{4.5}$$

$$\phantom{q} = \mp 2(x-v)(x-w)(w-v)\,f[v,w,x]. \tag{4.6}$$

From (4.4) and (4.6), the correction p/q should be small if x is close
to a minimum where the second derivative is positive, so the effect of
rounding errors in computing p and q is minimized. (Golub and Smith
(1967) compute a correction to $\frac{1}{2}(v+w)$ for the same reason.)
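The computation of p and q can be sketched in Python as follows. This is an illustrative transcription of the "Fit parabola" statements of procedure localmin (Section 8), not part of the original text; the function name `parabolic_step` and the test parabola are assumptions made for the example.

```python
def parabolic_step(v, fv, w, fw, x, fx):
    """Return (p, q) with q >= 0 such that x + p/q is the turning point of
    the parabola through (v, fv), (w, fw) and (x, fx).
    q == 0 signals coincident points or a degenerate (straight-line) fit."""
    r = (x - w) * (fx - fv)
    q = (x - v) * (fx - fw)
    p = (x - v) * q - (x - w) * r
    q = 2.0 * (q - r)
    # Choose the signs so that q is non-negative, as in the text.
    if q > 0:
        p = -p
    else:
        q = -q
    return p, q


# Hypothetical example: an exact parabola with its minimum at 1.5.
f = lambda z: (z - 1.5) ** 2 + 2.0
v, w, x = 0.0, 1.0, 3.0
p, q = parabolic_step(v, f(v), w, f(w), x, f(x))
u = x + p / q  # the turning point of the fitted parabola

# Three collinear points give a degenerate fit, reported as q == 0.
p_line, q_line = parabolic_step(0.0, 0.0, 1.0, 1.0, 2.0, 2.0)
```

Because the correction p/q is added to x, which is the best point found so far, the correction is small near a minimum, in line with the remark on rounding errors above.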

As in procedure zero, let e be the value of p/q at the second-last
cycle. If $|e| \le \mathrm{tol}$, $q = 0$, $x + p/q \notin (a,b)$, or
$|p/q| \ge \frac{1}{2}|e|$, then a "golden section" step is performed,
i.e., the next value of u is

$$u = \begin{cases} \frac{\sqrt{5}-1}{2}\,x + \frac{3-\sqrt{5}}{2}\,a & \text{if } x \ge m, \\ \frac{\sqrt{5}-1}{2}\,x + \frac{3-\sqrt{5}}{2}\,b & \text{if } x < m. \end{cases} \tag{4.7}$$

(An optimal choice in the limit: see Witzgall (1969).) Otherwise u is
taken as x + p/q (a "parabolic interpolation" step), except that
the distances |u-x|, u-a and b-u must be at least tol. Then f
is evaluated at the new point u, the points a, b, v, w and x
are updated as necessary, and the cycle is repeated (the procedure
returns to the label "loop"). We see that f is never evaluated at
two points closer together than tol, so $\delta$-unimodality for some
$\delta < \mathrm{tol}$ is enough to ensure that the global minimum is found
to an accuracy of $2\,\mathrm{tol} + \delta$ (see Theorem 3.5 and the
following remarks).
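As a small illustration of the golden section step (4.7), the following Python sketch computes the next evaluation point from a, b and x. It follows the corresponding statements of procedure localmin (e := (if x < m then b else a) - x; d := c × e); the function name is an assumption made for this example.

```python
import math

# The constant c = (3 - sqrt(5))/2 = 0.381966... of equation (4.1).
C = (3.0 - math.sqrt(5.0)) / 2.0


def golden_section_point(a, b, x):
    """Next point u for a golden section step: move from x a fraction C of
    the way towards the more distant end of the interval [a, b]."""
    m = 0.5 * (a + b)
    e = (b if x < m else a) - x
    return x + C * e


# Starting from the initial point (4.1) on [0, 1], the step lands on the
# other golden section point 1 - C, and a step from 1 - C lands back on C.
u_first = golden_section_point(0.0, 1.0, C)
u_second = golden_section_point(0.0, 1.0, 1.0 - C)
```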

Typically  the  algorithm  terminates  in  the  following  way:  x  =  b  -  tol 

(or,  symmetrically,  a+tol)  after  a  parabolic  interpolation  step  has  been 
performed with the condition $|u-x| \ge \mathrm{tol}$ enforced. The next parabolic
interpolation point lies very close to x and b, so u is forced to
be x - tol. If f(u) > f(x) then a moves to u, b-a becomes $2\,\mathrm{tol}$,
and the termination criterion is satisfied (see Diagram 4.3). Note that
two consecutive steps of tol are done just before termination. If a
golden section search were done whenever the last, rather than second-last,
value of |p/q| was tol or less, then termination with two consecutive
steps of tol would be prevented, and unnecessary golden section steps
would be performed.

Diagram 4.3: A typical situation after termination


5.  Convergence  properties 

There can not be more than about $2\log_2((b-a)/\mathrm{tol})$ consecutive
parabolic interpolation steps (with the current a and b, and the
minimum of tol over the interval), for while parabolic interpolation
steps are being performed |p/q| decreases by a factor of at least two
on every second cycle of the algorithm, and when $|e| \le \mathrm{tol}$ a golden
section step is performed. (In this section, "about" means we are not
distinguishing between a real number and its integer part.) A golden
section step does not necessarily decrease b-a significantly, e.g.,
if x = b - tol and f(u) < f(x), then b-a is only decreased by tol,
but two golden section steps must decrease b-a by a factor of at least
$\frac{1+\sqrt{5}}{2} = 1.618\ldots$. As in Section 4.3, we see that
convergence can not require more than about

$$2K\left(\log_2\frac{b-a}{\mathrm{tol}}\right)^2 \tag{5.1}$$

function evaluations, where

$$K = 1/\log_2\left(\frac{1+\sqrt{5}}{2}\right) = 1.44\ldots. \tag{5.2}$$

By comparison, a golden section or Fibonacci search would require about

$$K\log_2\frac{b-a}{\mathrm{tol}} \tag{5.3}$$

function evaluations, and a brute-force search about $\frac{b-a}{2\,\mathrm{tol}}$.

The  analogy  with  procedure  zero  of  Chapter  4  should  be  clear,  and 
essentially  the  same  remarks  apply  here  as  were  made  in  Chapter  4.  In 
practical  tests  convergence  has  never  been  more  than  5  percent  slower 
than  for  a  Fibonacci  search  (see  Section  6) . 
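To get a feeling for the sizes involved, the three estimates can be evaluated numerically. The sketch below assumes that the argument of the logarithm in (5.1) and (5.3) is (b-a)/tol, as in the first sentence of this section; the function name and the sample interval are illustrative only.

```python
import math

# K from equation (5.2): the reciprocal of log2 of the golden ratio.
K = 1.0 / math.log2((1.0 + math.sqrt(5.0)) / 2.0)


def evaluation_estimates(a, b, tol):
    """Approximate function-evaluation counts for the interval [a, b]."""
    L = math.log2((b - a) / tol)
    return {
        "localmin_worst_case": 2.0 * K * L * L,  # bound (5.1)
        "golden_section": K * L,                 # estimate (5.3)
        "brute_force": (b - a) / (2.0 * tol),
    }


# Hypothetical example: unit interval, tolerance 2**-20.
est = evaluation_estimates(0.0, 1.0, 2.0 ** -20)
```

For this example the worst-case bound for the combined algorithm lies between the golden section estimate and the brute-force count, which is the ordering the text describes.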

In deriving (5.1) we have ignored the effect of rounding errors inside
the procedure, but it is easy to see (as in Section 4.2) that they can not
prevent convergence if floating-point operations satisfy (4.2.10) and (4.2.11),
provided the parameter eps of procedure localmin is at least $2\varepsilon$.


Superlinear convergence

If f is $C^2$ near an interior minimum $\mu$ with $f''(\mu) > 0$, then
Theorem 3.4.1 shows that, while rounding errors are negligible, convergence
will be superlinear. Usually the algorithm stops doing golden section steps,
and eventually does only parabolic interpolation steps, with f(x) decreasing
at each step, until the tolerance comes into play just before termination.
This is certainly true if the successive parabolic interpolation process
converges with strong order 1.3247... (sufficient conditions for
this are given in Sections 3.6 and 3.7).

For most of the "ad hoc" methods given in the literature, convergence
with a guaranteed error bound of order tol in the number of steps given
by (5.1) is not certain, and, even if convergence does occur, the order
is no greater than for our algorithm. For example, the algorithm of
Davies, Swann and Campey (see Box, Davies and Swann (1969)) evaluates f
at two or more points for each parabolic fit, so the order of convergence
is at most $\sqrt{1.3247\ldots} = 1.150\ldots$ (excluding exceptional cases).




6. Practical tests

The ALGOL procedure localmin given in Section 8 has been tested using
ALGOL W (Wirth and Hoare (1966), Bauer, Becker and Graham (1968)) on an
IBM 360/67 and a 360/91 with a machine precision of $16^{-13}$. Although it
might be possible to contrive an example where the bound (5.1) on the
number  of  function  evaluations  is  nearly  attained,  for  our  test  cases 
convergence  never  requires  as  many  as  5  percent  more  function  evaluations 
than  would  be  needed  to  guarantee  the  same  accuracy  using  Fibonacci  search. 
In  most  practical  cases  superlinear  convergence  sets  in  after  a  few  golden 
section  steps,  and  the  procedure  is  much  faster  than  Fibonacci  search. 

As an example, in Table 6.1 we give the number of function evaluations
required to find the minima of the function

$$f(x) = \sum_{i=1}^{20}\left(\frac{2i-5}{x-i^2}\right)^2. \tag{6.1}$$

This function has poles at $x = 1^2, 2^2, \ldots, 20^2$. Restricted to the open
interval $(i^2, (i+1)^2)$ for $i = 1, 2, \ldots, 19$ it is unimodal (ignoring
rounding errors) with an interior minimum. The fourth column of Table 6.1
gives the number of function evaluations required to find this
minimum $\mu_i$, using procedure localmin with eps $= 16^{-7}$ and t $= 10^{-10}$
(so the error bound is less than $3\,\mathrm{tol}$, where
$\mathrm{tol} = 16^{-7}|x| + 10^{-10}$).
The last column of the table gives the number $n_Z$ of function
evaluations required to find the zero of

$$f'(x) = -2\sum_{i=1}^{20}\frac{(2i-5)^2}{(x-i^2)^3} \tag{6.2}$$

in the interval $[i^2 + 10^{-9},\ (i+1)^2 - 10^{-9}]$, using procedure zero (Section
4.6) with macheps $= 16^{-7}$ and t $= 10^{-10}$, so the guaranteed accuracy is

nearly  the  same  as  for  localmin.  Of  course,  in  practical  cases  we  would 
seldom  be  lucky  enough  to  have  such  a  simple  analytic  expression  for  f'  , 
so  procedure  zero  could  not  easily  be  used  to  find  minima  of  f  in  this 
manner.  Also,  procedure  zero  could  find  a  maximum  rather  than  a  minimum. 
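The test function is easy to write down in Python. The numerator (2i-5) used below is an assumption: it is consistent with the poles at x = i^2, with the cubed denominator in (6.2), and with the first row of Table 6.1, but it should be checked against a clean copy of the source. The function names are illustrative.

```python
def f(x):
    """Test function (6.1), assuming the summand ((2i - 5)/(x - i^2))^2."""
    return sum(((2 * i - 5) / (x - i * i)) ** 2 for i in range(1, 21))


def fprime(x):
    """Derivative (6.2) of f under the same assumption."""
    return -2.0 * sum((2 * i - 5) ** 2 / (x - i * i) ** 3 for i in range(1, 21))


# Around the tabulated minimum for i = 1 (near x = 3.0229153) the
# derivative changes sign from negative to positive.
left_slope = fprime(3.0)
right_slope = fprime(3.05)
value_at_min = f(3.0229153)
```

Each summand is convex between consecutive poles, so f' is increasing on every interval (i^2, (i+1)^2) and the sign change brackets the unique minimum there.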

Table  6.1  shows  that  the  number  of  function  evaluations  required  by 
procedure  localmin  compares  favorably  with  the  number  required  by  procedure 
zero. Both are much faster than Fibonacci search, which would require 45
function evaluations to find the minimum for i = 10 to the same accuracy.

For some numerical results illustrating the superlinear convergence
of the successive parabolic interpolation process, see Section 3.9.


Table 6.1: Comparison of procedures localmin and zero

   i     μ_i            f(μ_i)          n_L    n_Z

   1       3.0229153    3.6766990169    12     14
   2       6.6837536    1.1118500100    11      8
   3      11.2387017    1.2182217657    13     14
   4      19.6760001    2.1621105109    10     12
   5      29.8282273    3.0322905193    11     12
   6      41.9061162    3.7583856477    11     11
   7      55.9535958    4.3554103836    10     11
   8      71.9856656    4.8482959563    10     11
   9      90.0088685    5.2587585400    10     10
  10     110.0265527    5.6036524295    10     10
  11     132.0405517    5.8956037976    10     10
  12     156.0521144    6.1438861542     9     10
  13     182.0620604    6.3550764595     9     10
  14     210.0711010    6.5333662005     9     10
  15     240.0800483    6.6803659849     9     10
  16     272.0902669    6.7938538565     9     10
  17     306.1051253    6.8654981053     9     10
  18     342.1369451    6.8539024651     9      9
  19     380.2687097    6.6008470481     9      9

For a discussion and definition of the terms, see above.

7. Conclusion

The algorithm given in this chapter has the same advantages as the
algorithm described in Chapter 4 for finding zeros: convergence in a
reasonable number of steps is guaranteed for any function (see equation
(5.1)), and on well-behaved functions convergence is superlinear, with
order at least 1.3247..., and thus much faster than Fibonacci search.
There is no contradiction here: Fibonacci search is the fastest method
for the worst possible function, but our algorithm is faster on a large
class of functions (including, for example, $C^2$ functions with positive
second derivatives at interior minima).


A  similar  algorithm  using  derivatives 

We pointed out in Section 4.5 that bisection could be combined with
interpolation formulas which use both f and f'. We could combine
golden section search with an interpolation method using both f and f'
in a similar way. Davidon (1959) suggests fitting a cubic polynomial to
agree with f and f' at two points, and taking a turning point of the
cubic as the next approximation. (See also Johnson and Myers (1967).) This
method, which gives the possibility of superlinear convergence, could well
replace successive parabolic interpolation (using f at three points) in
our algorithm if f' is easy to compute. If the cubic has no real turning
point,  or  if  the  turning  point  which  is  a  local  minimum  lies  outside  the 
interval  known  to  contain  a  minimum  of  f  ,  then  we  can  resort  to  golden 
section  search. 


Parallel  algorithms 


So  far  we  have  considered  only  serial  (i.e.,  sequential)  algorithms 
for  finding  minima.  If  a  parallel  computer  is  available,  more  efficient 
algorithms  which  take  advantage  of  the  parallelism  are  possible,  just  as 
in the analogous zero-finding problem (see Section 4.5). Karp and
Miranker (1968) give a parallel search method which is a generalization of
Fibonacci search (and optimal in the same sense, if a sufficiently parallel
processor is available). See also Wilde (1964) and Avriel and Wilde (1966).
Miranker (1969) gives parallel methods for approximating the root of a
function, and these could be used to find a root of f' (or parallel
methods for finding a root of f', using only evaluations of f, could
be used). These parallel methods could be combined, in much the same way
as we have combined golden section search and successive parabolic
interpolation, to give a parallel method with guaranteed convergence,
and often superlinear convergence with a higher order than for our serial
method.


8.  An  ALGOL  60  procedure 

The  ALGOL  procedure  localmin  for  finding  a  local  minimum  of  a  function 
of  one  variable  is  given  below.  The  algorithm  and  some  numerical  results 
are  described  in  Sections  4  to  6. 

Procedure  localmin 

real procedure localmin (a, b, eps, t, f, x);
  value a, b, eps, t; real a, b, eps, t, x; real procedure f;
begin comment:
  If the function f is defined in the interval (a,b), then localmin
  finds an approximation x to the point at which f attains its minimum
  (or the appropriate limit point), and returns the value of f at x.
  t and eps define a tolerance tol = eps × |x| + t, and f is never evaluated
  at two points closer together than tol. If f is δ-unimodal (see
  Definition 3.3), for some δ < tol, then x approximates the global
  minimum of f with an error of less than 3 × tol (see Section 4). If
  f is not δ-unimodal on (a,b), then x may approximate a local, but
  non-global, minimum. eps should be no smaller than 2 × macheps, and
  preferably not much less than sqrt(macheps), where macheps is the
  relative machine precision (Section 4.2). t should be positive. For
  further details, see Section 2.

  The method used is a combination of golden section search and
  successive parabolic interpolation. Convergence is never much slower
  than for a Fibonacci search (see Sections 5 and 6). If f has a continuous
  second derivative which is positive at the minimum (not at a or b) then,
  ignoring rounding errors, convergence is superlinear, and usually the
  order is at least 1.3247...;

  real c, d, e, m, p, q, r, tol, t2, u, v, w, fu, fv, fw, fx;

  c := 0.381966011250105151795413165634; comment: c = (3 - sqrt(5))/2;
  v := w := x := a + c × (b-a); e := 0;
  fv := fw := fx := f(x);

  comment: Main loop;
loop: m := 0.5 × (a+b);
  tol := eps × abs(x) + t; t2 := 2 × tol;

  comment: Check stopping criterion;
  if abs(x-m) > t2 - 0.5 × (b-a) then
  begin p := q := r := 0;
    if abs(e) > tol then
    begin comment: Fit parabola;
      r := (x-w) × (fx-fv); q := (x-v) × (fx-fw);
      p := (x-v) × q - (x-w) × r; q := 2 × (q-r);
      if q > 0 then p := -p else q := -q;
      r := e; e := d
    end;

    if abs(p) < abs(0.5 × q × r) ∧ p > q × (a-x) ∧ p < q × (b-x) then
    begin comment: A "parabolic interpolation" step;
      d := p/q; u := x+d;
      comment: f must not be evaluated too close to a or b;
      if u-a < t2 ∨ b-u < t2 then d := if x < m then tol else -tol
    end
    else
    begin comment: A "golden section" step;
      e := (if x < m then b else a) - x; d := c × e
    end;

    comment: f must not be evaluated too close to x;
    u := x + (if abs(d) ≥ tol then d else if d > 0 then tol else -tol);
    fu := f(u);

    comment: Update a, b, v, w and x;
    if fu ≤ fx then
    begin if u < x then b := x else a := x;
      v := w; fv := fw; w := x; fw := fx; x := u; fx := fu
    end
    else
    begin if u < x then a := u else b := u;
      if fu ≤ fw ∨ w = x then
      begin v := w; fv := fw; w := u; fw := fu end
      else if fu ≤ fv ∨ v = x ∨ v = w then
      begin v := u; fv := fu end
    end;

    go to loop
  end;

  localmin := fx
end localmin;
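For readers without an ALGOL 60 compiler, the following Python function is a hedged transliteration of procedure localmin. It is a sketch, not the author's code: it returns the pair (x, f(x)) instead of using a name parameter, it assumes that the comparisons involving ties are non-strict (in line with the tie rule of Section 4), and the test functions at the end are hypothetical examples.

```python
import math


def localmin(a, b, eps, t, f):
    """Approximate a local minimum of f on (a, b); returns (x, f(x))."""
    c = (3.0 - math.sqrt(5.0)) / 2.0
    v = w = x = a + c * (b - a)
    d = e = 0.0
    fv = fw = fx = f(x)
    while True:
        m = 0.5 * (a + b)
        tol = eps * abs(x) + t
        t2 = 2.0 * tol
        # Stopping criterion: max(x - a, b - x) <= 2 tol.
        if abs(x - m) <= t2 - 0.5 * (b - a):
            return x, fx
        p = q = r = 0.0
        if abs(e) > tol:
            # Fit a parabola through (v, fv), (w, fw), (x, fx).
            r = (x - w) * (fx - fv)
            q = (x - v) * (fx - fw)
            p = (x - v) * q - (x - w) * r
            q = 2.0 * (q - r)
            if q > 0.0:
                p = -p
            else:
                q = -q
            r, e = e, d
        if abs(p) < abs(0.5 * q * r) and q * (a - x) < p < q * (b - x):
            # Parabolic interpolation step.
            d = p / q
            u = x + d
            # f must not be evaluated too close to a or b.
            if u - a < t2 or b - u < t2:
                d = tol if x < m else -tol
        else:
            # Golden section step.
            e = (b if x < m else a) - x
            d = c * e
        # f must not be evaluated too close to x.
        if abs(d) >= tol:
            u = x + d
        else:
            u = x + (tol if d > 0.0 else -tol)
        fu = f(u)
        # Update a, b, v, w and x.
        if fu <= fx:
            if u < x:
                b = x
            else:
                a = x
            v, fv, w, fw, x, fx = w, fw, x, fx, u, fu
        else:
            if u < x:
                a = u
            else:
                b = u
            if fu <= fw or w == x:
                v, fv, w, fw = w, fw, u, fu
            elif fu <= fv or v == x or v == w:
                v, fv = u, fu


# Hypothetical usage on two unimodal examples.
x_quad, f_quad = localmin(0.0, 5.0, 1e-6, 1e-9, lambda z: (z - 2.0) ** 2)
x_cos, f_cos = localmin(2.0, 5.0, 1e-6, 1e-9, math.cos)
```

Both examples are unimodal on the given intervals, so the returned points should lie within about 3·tol of 2 and of π respectively.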


Chapter  6. 


Global  Minimization  Given  an  Upper  Bound  on  the 
Second  Derivative 



1.  Introduction 

Minimization  procedures  like  the  one  described  in  Chapter  5  can 
only  guarantee  to  find  a  local,  not  necessarily  global,  minimum  of  a 
function $f \in C[a,b]$. If f happens to be unimodal then a local
minimum  must  be  the  global  minimum  in  [a,b]  ,  but  in  practical  problems 
it  often  happens  that  f  is  not  unimodal,  or  that  unimodality  is  difficult 
to  prove.  In  this  chapter  we  investigate  the  problem  of  finding  a  good 
approximation  to  the  global  minimum,  given  weaker  conditions  on  f  than 
unimodality.  As  usual,  we  consider  methods  which  depend  on  the  sequential 
evaluation  of  f  at  a  finite  number  of  points,  and  our  aim  is  to  reduce, 
as  far  as  possible,  the  number  of  function  evaluations  required  to  give 
an  answer  which  is  guaranteed  to  be  accurate  to  within  some  prescribed 
tolerance. 

In  Sections  2  to  6  we  describe  an  efficient  algorithm  for 
approximating  the  global  minimum  of  a  function  of  one  variable,  given  an 
upper  bound  on  the  second  derivative.  There  are  many  obvious  applications 
for  this  algorithm.  For  example,  when  finding  a  posteriori  error  bounds 
for the approximate solution of elliptic partial differential equations,
we may need to find the maximum of $|f(x)|$ (Fox, Henrici and Moler (1967)).
Instead of working with $|f(x)|$, which may have discontinuous derivatives,
it is probably better to use the relation

$$\max_x |f(x)| = -\min\left(\min_x f(x),\ \min_x\,(-f(x))\right). \tag{1.1}$$

In Sections 7 and 8 we show how to extend the method to functions of
several variables, and ALGOL 60 procedures are given in Section 10.


Some fundamental limitations

If $f \in C[a,b]$, let

$$\varphi_f = \inf\{f(x) \mid x \in [a,b]\}, \tag{1.2}$$

and

$$\mu_f = \inf\{x \in [a,b] \mid f(x) = \varphi_f\}. \tag{1.3}$$

Even if f satisfies very stringent smoothness conditions, the problem
of finding $\mu_f$ is improperly posed, in the sense that $\mu_f$ is not a
continuous function of f (with the uniform topology on C[a,b]).
For example, consider

$$f_\delta(x) = \cos(\pi x) - \delta x \tag{1.4}$$

on [-2,2]. If $\delta > 0$ then $\mu_{f_\delta} \simeq 1$, but if $\delta < 0$
then $\mu_{f_\delta} \simeq -1$,
so a very small change in f can cause a large change in $\mu_f$.
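The discontinuity is easy to observe numerically. The sketch below locates the minimum of cos(πx) - δx on [-2,2] on a fine grid for δ = ±10^-3; the grid search and the function names are illustrative devices, not part of the original text.

```python
import math


def grid_argmin(f, a, b, n=4001):
    """Return (x, f(x)) for the smallest value of f on an n-point grid."""
    best_x, best_f = a, f(a)
    for k in range(1, n):
        x = a + (b - a) * k / (n - 1)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


def f_delta(delta):
    """The function (1.4) for a given delta."""
    return lambda x: math.cos(math.pi * x) - delta * x


mu_plus, phi_plus = grid_argmin(f_delta(1e-3), -2.0, 2.0)
mu_minus, phi_minus = grid_argmin(f_delta(-1e-3), -2.0, 2.0)
```

The two functions differ by at most 4·10^-3 in the uniform norm and their minimum values nearly coincide, yet the positions of the minima are about 2 apart.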

Instead of trying to approximate $\mu_f$, we should seek to approximate
$\varphi_f = f(\mu_f)$. Since

$$|\varphi_f - \varphi_g| \le \|f - g\|_\infty \tag{1.5}$$

for all f and g in C[a,b], $\varphi_f$ is a continuous function on C[a,b], so
the problem of finding $\varphi_f$ is properly posed. However, given t > 0,
it is still impossible to find $\hat\varphi$ such that

$$|\hat\varphi - \varphi_f| \le t \tag{1.6}$$

with a finite number of function evaluations, unless we have some
a priori information about f.


A priori conditions on f

If $f \in C[a,b]$, the modulus of continuity $w(f;\delta)$ is defined (as
in Section 2.2) by

$$w(f;\delta) = \sup_{\substack{|x-y| \le \delta \\ x,y \in [a,b]}} |f(x) - f(y)| \tag{1.7}$$

for $\delta > 0$. Suppose that a function $W(\delta)$ is given such that

$$\lim_{\delta \to 0+} W(\delta) = 0, \tag{1.8}$$

and

$$w(f;\delta) \le W(\delta) \tag{1.9}$$

for all $\delta > 0$. Given t > 0, choose $\delta > 0$ such that

$$W(\delta) \le t \tag{1.10}$$

(always possible by (1.8)), and evaluate f at points $x_0, \ldots, x_n$ in
[a,b] such that

$$\max_{x \in [a,b]}\ \min_{0 \le i \le n} |x - x_i| \le \delta. \tag{1.11}$$

(For example, we might choose $x_0 = a + \delta$, $x_1 = a + 3\delta$,
$x_2 = a + 5\delta$, etc.) If

$$\hat\varphi = \min_{0 \le i \le n} f(x_i), \tag{1.12}$$

then, from (1.7), (1.9), (1.10) and (1.11),

$$0 \le \hat\varphi - \varphi_f \le t. \tag{1.13}$$

Thus, a quite weak condition on f, enabling us to approximate $\varphi_f$
with a finite number of evaluations of f, is that we have a bound
$W(\delta)$, satisfying (1.8), on the modulus of continuity $w(f;\delta)$ of f.

For example, if $f \in C^1[a,b]$ and

$$\|f'\|_\infty \le M \tag{1.14}$$

then we can take

$$W(\delta) = M\delta. \tag{1.15}$$
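A minimal Python sketch of this procedure for the Lipschitz case, where the bound on the modulus of continuity is M times the spacing, is given below. The function name, the choice of equally spaced midpoints, and the sine example are assumptions made for illustration.

```python
import math


def grid_minimum(f, a, b, M, t):
    """Return (phi_hat, n) with 0 <= phi_hat - min f <= t, assuming that
    |f'(x)| <= M on [a, b]; n is the number of function evaluations."""
    delta = t / M                      # then M * delta <= t
    n = max(1, math.ceil((b - a) / (2.0 * delta)))
    h = (b - a) / n                    # cell width, at most 2 * delta
    # Midpoints of n equal cells: every x in [a, b] is within h/2 <= delta
    # of some sample point.
    xs = [a + (k + 0.5) * h for k in range(n)]
    return min(f(x) for x in xs), n


# Hypothetical example: sin on [0, 7], where |f'| <= 1 and min f = -1.
phi_hat, n_evals = grid_minimum(math.sin, 0.0, 7.0, 1.0, 0.01)
```

The number of evaluations grows like (b-a)M/(2t), which is why this approach is only practical for modest accuracy requirements.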

Unfortunately, the procedure suggested above will be very slow if
t is small: in fact, about (b-a)M/(2t) function evaluations will be
required. In the worst case, though, it is impossible to do much better
than this without knowing more about f. To see this, consider
minimizing a function which is known to be in the class

$$\{f_c(x) = \min(1.01t,\ M|x-c|) \mid c \in [a,b]\}. \tag{1.16}$$

If

$$\delta = 1.01t/M, \tag{1.17}$$

and $\hat\varphi$ is computed from (1.12) for some set of points $x_0, \ldots, x_n$, then
there is a choice of $c \in [a,b]$ for which $\hat\varphi$ fails to satisfy (1.13)
unless (1.11) holds, so at least $\lceil (b-a)M/(2.02t) \rceil$ function evaluations
are required. In some cases fewer function evaluations will be required:
for example, if

$$f(x) = Mx, \tag{1.18}$$

then it is enough to evaluate f at a and b. (See also Section 5.)

Instead of having an a priori bound on $\|f'\|_\infty$, we could have a
bound

$$\|f^{(r)}\|_\infty \le M \tag{1.19}$$

on $\|f^{(r)}\|_\infty$, for some $r \ge 1$. We show below that, with such a bound,
the maximum number of function evaluations required to find $\hat\varphi$
satisfying (1.13) is of order $(M/t)^{1/r}$.

The case r = 1 is discussed above, so suppose $r \ge 2$, and let

$$n = \left\lceil \frac{b-a}{4\cos\left(\frac{\pi}{2r}\right)} \left(\frac{4M}{r!\,t}\right)^{1/r} \right\rceil. \tag{1.20}$$

Define $\delta = (b-a)/n$, $a_i = a + i\delta$ for $i = 0, \ldots, n$ (so $a_n = b$), and

$$a_{i,j} = a_i + \frac{\delta}{2}\left(1 - \frac{\cos\left((j-\frac{1}{2})\pi/r\right)}{\cos\left(\frac{1}{2}\pi/r\right)}\right) \tag{1.21}$$

for $i = 0, \ldots, n-1$ and $j = 1, \ldots, r$ (so $a_{i,1} = a_i$, $a_{i,r} = a_{i+1}$).
Let $P_i = IP(f; a_{i,1}, \ldots, a_{i,r})$ be the polynomial of degree r-1 which
coincides with f at $a_{i,1}, \ldots, a_{i,r}$. Then, Lemma 2.4.1 and the bound
(1.19) show that, for all $x \in [a_i, a_{i+1}]$,

$$|f(x) - P_i(x)| \le |(x - a_{i,1}) \cdots (x - a_{i,r})|\,\frac{M}{r!}. \tag{1.22}$$

The right side of (1.22) is no greater than

$$2\left(\frac{\delta}{4\cos\left(\frac{\pi}{2r}\right)}\right)^{r} \frac{M}{r!},$$

and, by (1.20) and the choice of $\delta$, this is no greater than t/2. Thus,
we need only find the minimum of each polynomial $P_i(x)$ in $[a_i, a_{i+1}]$
to within a tolerance t/2. This is easy if r = 2, for then each
polynomial $P_i(x)$ is linear. If r > 2 then we can bound $|P_i''(x)|$
in $[a_i, a_{i+1}]$, and apply the procedure for r = 2 to minimize $P_i(x)$.
(This idea for finding bounds on polynomials in an interval was suggested
by Rivlin (1970).) Because successive intervals are adjacent,
the number of function evaluations required to find $\hat\varphi$ satisfying (1.13)
does not exceed

$$N = (r-1)n + 2, \tag{1.23}$$

where n is given by (1.20).

Since N is of order $(M/t)^{1/r}$, the method described above is
not likely to be practical for small t unless $r \ge 2$. On the other
hand, in practical problems it is usually difficult to obtain good bounds
on the third or higher derivatives of f (if they exist). Thus, in the
rest of this chapter we suppose that r = 2. It turns out that a
one-sided bound

$$f''(x) \le M \tag{1.24}$$

is sufficient, instead of the two-sided bound (1.19). If f''(x) has a
physical interpretation (e.g., as an acceleration), then a bound of the
form (1.24) can sometimes be obtained from physical considerations.

2.  The  basic  theorems 

The  global  minimization  algorithm  which  is  described  in  the  next 
section depends on the simple Theorems 2.1, 2.2 and 2.3. Theorem 2.1 is
related to the maximum principle for elliptic difference operators, and
also to some results in Davis (1965). We assume that $f \in C^1[a,b]$, and

$$f'(x) - f'(y) \le M(x-y), \tag{2.1}$$

for all x, y in [a,b] with x > y. (Weaker conditions suffice:
see Section 7.) If $f \in C^2[a,b]$ then the one-sided Lipschitz condition
(2.1) is equivalent to

$$f''(x) \le M \tag{2.2}$$

for all $x \in [a,b]$.

Theorem  2.1 

Suppose (2.1) holds. Then, for all $x \in [a,b]$,

$$f(x) \ge \frac{(b-x)\,f(a) + (x-a)\,f(b)}{b-a} - \frac{1}{2}M(x-a)(b-x). \tag{2.3}$$


Proof 

The  proof  is  immediate  from  Lemma  2.4.1. 
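A lower bound of the kind Theorem 2.1 provides can be checked numerically. The sketch below assumes the bound has the form used in step 4 of the basic algorithm of Section 3, namely the parabola P with P'' = M that agrees with f at a and b; the function name and the examples are illustrative.

```python
import math


def parabola_lower_bound(a, fa, b, fb, M, x):
    """Value at x of the parabola P with P'' = M, P(a) = fa, P(b) = fb."""
    line = ((b - x) * fa + (x - a) * fb) / (b - a)
    return line - 0.5 * M * (x - a) * (b - x)


# Hypothetical check: f = cos on [0, 6] satisfies f'' = -cos <= 1 = M.
a, b, M = 0.0, 6.0, 1.0
fa, fb = math.cos(a), math.cos(b)
gaps = [
    math.cos(x) - parabola_lower_bound(a, fa, b, fb, M, x)
    for x in (k * 0.01 for k in range(601))
]

# Sharpness: for f(x) = (M/2) x^2 the bound holds with equality.
sharp = parabola_lower_bound(-1.0, 1.0, 3.0, 9.0, 2.0, 1.0)
```

In the first example every gap is non-negative (up to rounding), and in the second the bound reproduces f(1) = 1 exactly, illustrating the remark that the theorem is sharp for a suitable parabola.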

Lemma  2.1 

Suppose (2.1) holds and a < 0 < b. Then

$$f'(0) \le \frac{f(a) - f(0)}{a} - \frac{1}{2}Ma. \tag{2.4}$$

Proof

Applying Lemma 2.3.1 to f(-x), we have

$$f(a) \le f(0) + a\,f'(0) + \frac{1}{2}Ma^2, \tag{2.5}$$

so the result follows.

Theorem 2.2

Suppose (2.1) holds, M > 0, a < c < b, f(a) > f(c), and
f'(c) = 0. Then

$$c - a \ge \sqrt{\frac{f(a) - f(c)}{\frac{1}{2}M}}. \tag{2.6}$$

Proof

Applying Lemma 2.1 with a suitable translation of the origin gives

$$0 = f'(c) \le \frac{f(a) - f(c)}{a - c} - \frac{1}{2}M(a-c), \tag{2.7}$$

so

$$f(a) - f(c) \le \frac{1}{2}M(c-a)^2, \tag{2.8}$$

and the result follows.


Lemma  2.2 

Suppose (2.1) holds, M > 0, and $a < 0 < b \le -f'(0)/M$. Then
$f'(b) \le 0$.

Proof

By condition (2.1),

$$f'(b) \le f'(0) + Mb, \tag{2.9}$$

and, as

$$b \le -f'(0)/M, \tag{2.10}$$

the result follows.

Theorem 2.3

Suppose (2.1) holds, M > 0, a < c < b, and

$$c < x \le \min\left(b,\ \frac{a+c}{2} - \frac{f(a)-f(c)}{M(a-c)}\right). \tag{2.11}$$

Then

$$f'(x) \le 0. \tag{2.12}$$

Proof

There is no loss of generality in assuming that c = 0 and b = x.
By condition (2.11),

$$b = x \le \frac{1}{2}a - \frac{f(a)-f(0)}{Ma} = -\frac{1}{M}\left[\frac{f(a)-f(0)}{a} - \frac{1}{2}Ma\right], \tag{2.13}$$

so, by Lemma 2.1, we have

$$b \le -f'(0)/M. \tag{2.14}$$

Now the result follows from Lemma 2.2.


Remarks 


Theorems 2.1, 2.2 and 2.3 are sharp, as can easily be seen by
taking f(x) as a suitable parabola with leading term $\frac{1}{2}Mx^2$. The
theorems are generalized in Section 7, and the proofs given there show
that everything needed to justify our minimization algorithm follows
from the fundamental inequality (2.3). The proofs given in this section
are, however, simpler and more intuitive than those in Section 7.


3. An algorithm for global minimization


Suppose that $f \in C^2[a,b]$ and, for all $x \in [a,b]$,

$$f''(x) \le M. \tag{3.1}$$

We want to find $\hat\mu \in [a,b]$ and $\hat\varphi = f(\hat\mu)$ satisfying

$$|\hat\varphi - \varphi_f| \le t, \tag{3.2}$$

where t is a given positive tolerance, and

$$\varphi_f = \min_{x \in [a,b]} f(x). \tag{3.3}$$

If $M \le 0$ the problem is quite trivial, for Theorem 2.1 says that f(x)
can not lie below the straight line interpolating f at a and b, so

$$\varphi_f = \min(f(a), f(b)). \tag{3.4}$$

If  M  >  0  the  problem  is  not  trivial,  although  we  saw  in  Section  1  that 
there  does  exist  an  algorithm  to  solve  it. 

The  basic  algorithm 

The algorithm described in this section is an elaboration and
refinement of the following basic algorithm. (The notation is consistent
with that of the ALGOL procedure glomin (Section 10), except that we
write $M$ for m , $\hat\mu$ for x , $\hat\varphi$ for y (= glomin), and $\varepsilon$ for
macheps.)

1.  Set $\hat\varphi \leftarrow \min(f(a), f(b))$ ,
    $\hat\mu \leftarrow$ if $\hat\varphi = f(a)$ then $a$ else $b$ ,
    and $a_2 \leftarrow a$ .

2.  If $M \le 0$ or $a_2 \ge b$ then halt. Otherwise set $a_3 \leftarrow$ some point
    in $(a_2, b]$ (e.g., $b$ : see below for a better choice).

3.  If $f(a_3) < \hat\varphi$ then set $\hat\mu \leftarrow a_3$ and $\hat\varphi \leftarrow f(a_3)$ .

4.  If the parabola $y = P(x)$ , with $P''(x) = M$ , $P(a_2) = f(a_2)$
    and $P(a_3) = f(a_3)$ , satisfies $P(x) \ge \hat\varphi - t$ for all $x$ in $[a_2, a_3]$ ,
    then go on to 5. Otherwise set $a_3 \leftarrow \tfrac{1}{2}(a_2 + a_3)$ and go back to 3.

5.  Set $a_2 \leftarrow a_3$ and go back to 2.

We shall see shortly that (with a sensible choice of $a_3$ at
step 2) the basic algorithm must terminate in a finite number of steps.
In view of Theorem 2.1 and step 4, it is clear that, when the algorithm
terminates, it does so with $\hat\varphi$ satisfying (3.2).
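The five steps of the basic algorithm can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ALGOL procedure glomin: it always takes $a_3 = b$ at step 2, ignores rounding errors, and the names `parabola_min`, `basic_glomin` and `example` are hypothetical. Step 4 is decided from the closed-form minimum of the parabola with second derivative $M$ through the two end points.

```python
import math


def parabola_min(a2, y2, a3, y3, M):
    """Minimum on [a2, a3] of the parabola P with P'' = M > 0,
    P(a2) = y2 and P(a3) = y3."""
    d0 = a3 - a2
    m2 = 0.5 * M
    p = (y2 - y3) / (m2 * d0)        # P' vanishes at (a2 + a3 + p) / 2
    if abs(p) >= d0:                 # turning point outside (a2, a3): P is monotonic
        return min(y2, y3)
    return 0.5 * (y2 + y3) - 0.25 * m2 * (d0 * d0 + p * p)


def basic_glomin(f, a, b, M, t):
    """Steps 1-5 with the simplest choice a3 = b at step 2.
    Returns (mu_hat, phi_hat, number of evaluations of f)."""
    fa, fb = f(a), f(b)
    evals = 2
    mu, phi = (a, fa) if fa <= fb else (b, fb)       # step 1
    a2, y2 = a, fa
    while M > 0 and a2 < b:                          # step 2
        a3 = b
        while True:
            y3 = f(a3)
            evals += 1
            if y3 < phi:                             # step 3
                mu, phi = a3, y3
            if parabola_min(a2, y2, a3, y3, M) >= phi - t:   # step 4
                break
            a3 = 0.5 * (a2 + a3)                     # reduce a3 and retry
        a2, y2 = a3, y3                              # step 5
    return mu, phi, evals


# Illustrative test function (an assumption): f'' <= 1 + 100/9 < 12.2 everywhere.
def example(x):
    return math.sin(x) + math.sin(10.0 * x / 3.0)


mu_hat, phi_hat, n_evals = basic_glomin(example, 2.7, 7.5, 12.2, 1e-3)
```

Because $a_3$ restarts from $b$ on every cycle, this sketch uses many more evaluations than the refined choices of $a_3$ discussed next; it is meant only to make the control flow concrete.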

Refinements  of  the  basic  algorithm 

The crux of the problem is how to make a good choice of $a_3$ at
step 2 of the basic algorithm. We want to choose $a_3$ as large as
possible, but not so large that it has to be reduced at step 4.
Theorems 2.2 and 2.3 provide useful lower bounds. If the global minimum
$\mu_f$ lies outside $(a_2, b)$ , or if $\varphi_f \ge \hat\varphi - t$ , then the algorithm may
halt, for $\hat\varphi$ already satisfies (3.2). Otherwise

$f'(\mu_f) = 0$  (3.5)

and

$f(\mu_f) < \hat\varphi - t$ ,  (3.6)

so, from Theorem 2.2 with $a$ replaced by $a_2$ and $c$ by $\mu_f$ ,

$\mu_f - a_2 \ge \left( \frac{f(a_2) - \hat\varphi + t}{\tfrac{1}{2} M} \right)^{1/2}$ .  (3.7)

Thus, at step 2 it is safe to take $a_3 = a_3'$ , where

$a_3' = \min\left( b ,\ a_2 + \left( \frac{f(a_2) - \hat\varphi + t}{\tfrac{1}{2} M} \right)^{1/2} \right)$ ,  (3.8)

and with this choice there is no risk that $a_3$ will have to be reduced
at step 4. Since the right side of (3.7) is at least $(2t/M)^{1/2}$ , the
basic algorithm must converge in a finite number of steps if, in step 2,
we choose any $a_3$ in the range $[a_3', b]$ .

If $f$ is decreasing rapidly at $a_2$ , then Theorem 2.3 may give a
better bound than (3.7). Apply Theorem 2.3 with $c$ replaced by $a_2$
and $a$ replaced by a point $a_2 - d_0$ (with $d_0 > 0$ ) where $f$ has
already been evaluated. (This is not possible if $a_2 = a$ .) Combining
the result with (3.8), we see that it is safe to choose $a_3 = a_3''$ at
step 2, where

$a_3'' = \min\left( b ,\ \max\left( a_2 + \left( \frac{f(a_2) - \hat\varphi + t}{\tfrac{1}{2} M} \right)^{1/2} ,\ a_2 - \tfrac{1}{2}\left[ d_0 + \frac{f(a_2) - f(a_2 - d_0) + 2.01e}{\tfrac{1}{2} M d_0} \right] \right) \right)$ .  (3.9)

Here $e$ is a positive tolerance, and the term $2.01e$ is introduced
to combat the effect of rounding errors (see equations (3.41) and (3.52)).
The choice $a_3 = a_3''$ is safe, but it is possible to speed up the
algorithm by sometimes choosing $a_3 > a_3''$ . Because we want to avoid
having to decrease $a_3$ at step 4, the best choice would be to take
$a_3 = \min(b, a_3^*)$ , where $a_3^*$ is the abscissa of the point to the right
of $a_2$ where the curve $y = f(x)$ intersects the parabola $P$ , with
second derivative $M$ , which passes through $(a_2, f(a_2))$ and attains
its minimum value $\varphi^* - t$ to the right of $a_2$ . Here

$\varphi^* = \min(\hat\varphi, f(a_3^*))$  (3.10)

is the value of $\hat\varphi$ after step 3 has been executed, and we can extend
the domain of $f$ by defining $f(x) = f(b)$ for $x > b$ if this is
necessary. A typical situation is illustrated in Diagram 3.1.


Diagram 3.1: The points $a_3''$ and $a_3^*$


It is not practical to choose $a_3 = a_3^*$ , for, although $a_3^*$ exists,
several function evaluations are needed to approximate it accurately.
Procedure glomin (Section 10) finds a rough approximation $a_3^{**}$ to $a_3^*$ ,
without any extra function evaluations, by assuming that $f$ can be
approximated sufficiently well by the parabola which interpolates $f$ at
the last three points at which $f$ has been evaluated. To avoid
overstepping $a_3^*$ too often, because of the inadequacy of the parabolic
approximation to $f$ , the procedure uses a heuristic "safety factor"
$h \in (0,1)$ . If

$a_3''' = \min(b ,\ a_2 + h(a_3^{**} - a_2))$ ,  (3.11)

then at step 2 we choose

$a_3 = \max(a_3'', a_3''')$ ,  (3.12)

and if it is necessary to reduce $a_3$ at step 4 then we set
$a_3 \leftarrow \max(a_3'' ,\ \tfrac{1}{2}(a_2 + a_3))$ . Procedure glomin also makes a rather
primitive attempt to adjust $h$ , the adjustment depending on the outcome
of step 4.
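As a concrete reading of the safe choice of $a_3$ at step 2, the following Python sketch combines the square-root bound obtained from Theorem 2.2 with the bound obtained from Theorem 2.3, assuming exact arithmetic inside the routine. The function name `safe_a3` and its argument layout are hypothetical; `prev` is an earlier evaluation point $(a_2 - d_0, f(a_2 - d_0))$ if one exists, and `e` is the tolerance on the error in computing $f$.

```python
import math


def safe_a3(a2, y2, b, phi_hat, t, M, prev=None, e=0.0):
    """Largest a3 known to be safe at step 2, for M > 0.
    y2 = f(a2); phi_hat is the best function value found so far."""
    m2 = 0.5 * M
    s = math.sqrt((y2 - phi_hat + t) / m2)        # bound from Theorem 2.2
    step = s
    if prev is not None:
        a1, y1 = prev
        d0 = a2 - a1                              # requires d0 > 0
        r = -0.5 * (d0 + (y2 - y1 + 2.01 * e) / (m2 * d0))   # bound from Theorem 2.3
        step = max(s, r)
    return min(b, a2 + step)
```

For the parabola $f(x) = (x-5)^2$ with $M = 2$, evaluations at 0 and 1 give `safe_a3(1.0, 16.0, 10.0, 16.0, 0.01, 2.0, prev=(0.0, 25.0))`, which steps straight to the turning point, illustrating the sharpness of Theorem 2.3 mentioned in the remarks of Section 2.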

Some details of procedure glomin

The ALGOL 60 procedure glomin given in Section 10 uses the basic
algorithm with the refinements suggested above. From equation (3.8)
and the criterion in step 4 of the basic algorithm, it is clear that,
to speed up convergence, we want to find a rough approximation to the
global minimum as soon as possible. In other words, $\hat\varphi$ should be
nearly at its final value as soon as possible. For this reason, procedure
glomin incorporates several strategies which are designed to reduce $\hat\varphi$
quickly. We emphasize that the global minimum would be found without
using these strategies; the strategies merely reduce the number of
function evaluations required (see Sections 5 and 6).

The first strategy for reducing $\hat\varphi$ quickly is a pseudo-random
search. About 10 percent of the function evaluations are used to
evaluate $f$ at "random" points uniformly distributed in $(a_2, b)$ .
( $f$ is not evaluated at the random point if Theorem 2.1, with $a$
replaced by $a_2$ and $x$ by $a_3$ , indicates that $f(a_3) \ge \hat\varphi - t$ , for
such an evaluation would be a waste of time.) At worst, this strategy
wastes 10 percent of the function evaluations, but in practice the
saving in function evaluations caused by quickly finding a good value
of $\hat\varphi$ is often much more than 10 percent. (The choice of 10 percent
is, of course, rather arbitrary.)

By comparison with the random search strategy, the second strategy
is a highly "non-random" search. $f$ is evaluated at the minimum
of the parabola which interpolates $f$ at the last three points at which
$f$ has been evaluated, provided that this point $a_3$ lies in $(a_2, b)$
and Theorem 2.1 does not show that the evaluation is futile for the purpose
of reducing $\hat\varphi$ . The details are similar to those of procedure localmin
(see Chapter 5). This strategy helps to locate the local minima of $f$
which are in the interior of $[a,b]$ , and, unless the global minimum is
at $a$ or $b$ , one of these local minima is the global minimum. A bonus
is that, if $f$ is sufficiently well-behaved near the global minimum
(see Chapter 5 for more precise conditions), then the minimum will be
found more accurately than would be expected with the basic algorithm.
The numerical examples given in Sections 6 and 8 illustrate this. To
avoid wasting function evaluations by repeatedly finding the same local
minimum, this strategy is only used about once in every tenth cycle,
although it is always used if $\hat\varphi = f(a_2)$ , for then there is a good
chance that $f(a_3) < \hat\varphi$ .

Finally, the user may be able to make a good guess at the global
minimum. For example, he may know a local minimum which is likely
to be the global minimum, or he may know the global minimum of a
slightly different function (see the application discussed in Section 8).
Thus, procedure glomin has an input parameter c which may be set by
the user at the suspected position of the global minimum, and on entry
the procedure evaluates $f$ at c in an attempt to reduce $\hat\varphi$ . If the
user knows nothing about the likely position of the global minimum, he
can set c = $a$ or $b$ .

We can now summarize procedure glomin (for points of detail, see
Section 10). Step 1 of the basic algorithm is performed, and the
algorithm terminates immediately unless $M > 0$ and $a < b$ . Before
choosing $a_3 \in (a_2, b]$ at step 2, the strategies described above are used
to try to reduce $\hat\varphi$ . Then $a_3$ is chosen, and perhaps reduced at
step 4, as described above.

The  reader  who  is  not  very  interested  in  the  murky  details  of 
procedure  glomin,  or  in  the  effect  of  rounding  errors,  would  be  well 
advised  to  skip  the  rest  of  this  section. 

Some of the formulas used by procedure glomin need an explanation.
When either the random or non-random search strategy is performed, we
have numbers $q$ and $r$ , and wish to determine if the relation

$q \ne 0 \ \wedge\ (a_2 < a_2 + r/q < b) \ \wedge\ \frac{(b - (a_2 + r/q)) f(a_2) + (r/q) f(b)}{b - a_2} - \tfrac{1}{2} M (r/q)(b - (a_2 + r/q)) < \hat\varphi - t$  (3.13)

is true. If $m_2 = \tfrac{1}{2} M > 0$ , $z_2 = b - a_2 > 0$ , $y_b = f(b)$ , and
$y_2 = f(a_2)$ , then (3.13) is equivalent to

$q [ r (y_b - y_2) + z_2 q (y_2 - \hat\varphi + t) ] < z_2 m_2 r (z_2 q - r)$ ,  (3.14)

which is the condition tested after label "retry" of procedure glomin.
(If $q = 0$ then (3.14) is false, and it is also false if $a_2 + r/q$
lies outside $(a_2, b)$ , since $m_2 > 0$ and $\hat\varphi - t < \min(y_2, y_b)$ .)
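The passage from (3.13) to the division-free form (3.14) amounts to multiplying through by $(b - a_2) q^2 > 0$. The Python sketch below evaluates both forms and counts disagreements on random data; it is a numerical sanity check rather than a proof. The function names are hypothetical, the sampling ranges are arbitrary assumptions, and the random check is restricted to cases where $a_2 + r/q$ lies inside $(a_2, b)$, with near-ties skipped to avoid rounding noise.

```python
import random


def relation_313(q, r, a2, b, y2, yb, phi_hat, t, M):
    """Relation (3.13), written with the division r/q."""
    if q == 0:
        return False
    u = r / q
    if not (a2 < a2 + u < b):
        return False
    chord = ((b - (a2 + u)) * y2 + u * yb) / (b - a2)
    return chord - 0.5 * M * u * (b - (a2 + u)) < phi_hat - t


def relation_314(q, r, a2, b, y2, yb, phi_hat, t, M):
    """Relation (3.14): the same test without divisions."""
    m2, z2 = 0.5 * M, b - a2
    return q * (r * (yb - y2) + z2 * q * ((y2 - phi_hat) + t)) < z2 * m2 * r * (z2 * q - r)


def count_disagreements(trials=2000, seed=1):
    """Compare the two forms on random data with a2 < a2 + r/q < b."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        a2 = rng.uniform(-1.0, 1.0)
        b = a2 + rng.uniform(0.5, 2.0)
        y2, yb = rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)
        phi_hat = min(y2, yb) - rng.uniform(0.0, 1.0)
        t, M = rng.uniform(0.01, 0.5), rng.uniform(0.5, 20.0)
        q = rng.choice([-1.0, 1.0]) * rng.uniform(0.5, 2.0)
        r = q * rng.uniform(0.05, 0.95) * (b - a2)
        u = r / q
        margin = (((b - a2 - u) * y2 + u * yb) / (b - a2)
                  - 0.5 * M * u * (b - a2 - u) - (phi_hat - t))
        if abs(margin) < 1e-6:        # skip near-ties
            continue
        args = (q, r, a2, b, y2, yb, phi_hat, t, M)
        if relation_313(*args) != relation_314(*args):
            bad += 1
    return bad
```

Avoiding the division is what lets the test be applied directly to the numerator and denominator produced by the parabolic interpolation step, including the degenerate case $q = 0$.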

To approximate $a_3^*$ , we need the point $a_3^{**}$ where the parabola
$y = P(x)$ , passing through $(a_i, y_i)$ for $i = 0, 1, 2$ , intersects the
parabola

$y = \hat\varphi - t + m_2 (x - a_2 - s)^2$ ,  (3.15)

where $s$ is given by (3.22). (In procedure glomin we use c in place of $a_1$ to save a storage
location.) Let $z_0 = y_2 - y_1$ , $z_1 = y_2 - y_0$ , $d_0 = a_2 - a_1$ , $d_1 = a_2 - a_0$ ,
and $d_2 = a_1 - a_0$ . In the non-random search we have already computed
numbers $p$ and $q$ ( $r$ and $q$ above) with

$p = d_1^2 z_0 - d_0^2 z_1$  (3.16)

and

$q = 2 (d_0 z_1 - d_1 z_0)$ ,  (3.17)

in order to find the turning point $a_2 + p/q$ of $P(x)$ . By forming
the quadratic equation for $a_3^{**}$ , and dividing out the unwanted root $a_2$ ,
we find that

$a_3^{**} = a_2 + p'/q'$ ,  (3.18)

where

$p' = p + 2rs$ ,  (3.19)

$q' = r + \tfrac{1}{2} q$ ,  (3.20)

$r = d_0 d_1 d_2 m_2$ ,  (3.21)

and

$s = \left( \frac{y_2 - \hat\varphi + t}{m_2} \right)^{1/2}$ .  (3.22)
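The estimate of the intersection point can be written out directly. The sketch below assumes the reading $p = d_1^2 z_0 - d_0^2 z_1$, $q = 2(d_0 z_1 - d_1 z_0)$, $r = d_0 d_1 d_2 m_2$, $s = ((y_2 - \hat\varphi + t)/m_2)^{1/2}$, $p' = p + 2rs$ and $q' = r + q/2$ of formulas (3.16) to (3.22), and that the second parabola is $y = \hat\varphi - t + m_2 (x - a_2 - s)^2$. The function name `a3_estimate` is hypothetical, and no guard against $q' = 0$ is included.

```python
import math


def a3_estimate(a0, y0, a1, y1, a2, y2, phi_hat, t, M):
    """Abscissa, other than a2, where the parabola through (a_i, y_i),
    i = 0, 1, 2, meets the parabola with second derivative M through
    (a2, y2) whose minimum value is phi_hat - t."""
    m2 = 0.5 * M
    z0, z1 = y2 - y1, y2 - y0
    d0, d1, d2 = a2 - a1, a2 - a0, a1 - a0
    p = d1 * d1 * z0 - d0 * d0 * z1              # (3.16)
    q = 2.0 * (d0 * z1 - d1 * z0)                # (3.17): turning point is a2 + p/q
    r = d0 * d1 * d2 * m2                        # (3.21)
    s = math.sqrt((y2 - phi_hat + t) / m2)       # (3.22)
    return a2 + (p + 2.0 * r * s) / (r + 0.5 * q)    # (3.18) with (3.19), (3.20)


# Illustrative data (an assumption): samples of x^2 - 4x at 0, 0.5 and 1, with M = 6.
x_est = a3_estimate(0.0, 0.0, 0.5, -1.75, 1.0, -3.0, -3.0, 1.2, 6.0)
```

For these data the interpolating parabola is $x^2 - 4x$ itself, so the result can be checked by substituting `x_est` into both parabolas.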


Finally, there is the inspection of the lower bound on $f$ in
$(a_2, a_3)$ given by the parabola

$y = \frac{(a_3 - x) y_2 + (x - a_2) y_3}{d_0} - m_2 (x - a_2)(a_3 - x)$ ,  (3.23)

where $m_2 = \tfrac{1}{2} M > 0$ and

$d_0 = a_3 - a_2 > 0$ .  (3.24)

If

$p = \frac{y_2 - y_3}{m_2 d_0}$ ,  (3.25)

then the parabola (3.23) is monotonic increasing or decreasing in
$(a_2, a_3)$ provided

$|p| \ge d_0$ .  (3.26)

Otherwise, the parabola (3.23) attains its minimum in $(a_2, a_3)$ , and
the minimum value is $\tfrac{1}{2}(y_2 + y_3) - \tfrac{1}{4} m_2 (d_0^2 + p^2)$ at $x = \tfrac{1}{2}(a_2 + a_3 + p)$ .
Thus, at step 4 of the basic algorithm, $a_3$ must be reduced if

$|p| < d_0 \ \wedge\ \tfrac{1}{2}(y_2 + y_3) - \tfrac{1}{4} m_2 (d_0^2 + p^2) < \hat\varphi - t$ ,  (3.27)

i.e., if

$|p| < d_0 \ \wedge\ \tfrac{1}{4} M (d_0^2 + p^2) > (y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t$ .  (3.28)
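The reduction test can be compared with a brute-force scan of the bounding parabola. This Python sketch assumes the reading $p = (y_2 - y_3)/(\tfrac{1}{2} M d_0)$ with $d_0 = a_3 - a_2$, and the test $|p| < d_0 \wedge \tfrac{1}{4} M (d_0^2 + p^2) > (y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t$; the function names are hypothetical and rounding errors are ignored.

```python
def parabola_323(x, a2, y2, a3, y3, M):
    """Lower bound (3.23) on f in (a2, a3) implied by f'' <= M."""
    d0 = a3 - a2
    return ((a3 - x) * y2 + (x - a2) * y3) / d0 - 0.5 * M * (x - a2) * (a3 - x)


def must_reduce(a2, y2, a3, y3, phi_hat, t, M):
    """Closed-form test (3.28): True if the bound dips below phi_hat - t."""
    d0 = a3 - a2
    p = (y2 - y3) / (0.5 * M * d0)
    return abs(p) < d0 and 0.25 * M * (d0 * d0 + p * p) > (y2 - phi_hat) + (y3 - phi_hat) + 2.0 * t


def sampled_min(a2, y2, a3, y3, M, n=2001):
    """Brute-force minimum of (3.23) on a uniform grid, for comparison only."""
    return min(parabola_323(a2 + (a3 - a2) * i / (n - 1), a2, y2, a3, y3, M)
               for i in range(n))
```

With $a_2 = 0$, $a_3 = 1$, $y_2 = y_3 = \hat\varphi = 1$ and $t = 0.1$, the test asks for a reduction when $M = 2$ (the bound falls to 0.75) but not when $M = 0.5$ (the bound only falls to 0.9375).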


The  effect  of  rounding  errors 

So far we have ignored the effect of rounding errors, which
actually occur both in the computation of $f(x)$ and in the internal
computations of procedure glomin. Now we show how these rounding errors
can be accounted for.

Let $\varepsilon$ be the relative machine precision (parameter macheps of
procedure glomin), i.e.,

$\varepsilon = \beta^{1-\tau}$  (truncated arithmetic),
$\varepsilon = \tfrac{1}{2} \beta^{1-\tau}$  (rounded arithmetic),

for $\tau$-digit floating-point arithmetic to base $\beta$ . We suppose,
following Wilkinson (1963), that

$fl(x \# y) = (x \# y)(1 + \delta)$ ,  (3.29)

where $\#$ stands for any of the arithmetic operations $+$ , $-$ , $\times$ , $/$ , and

$|\delta| \le \varepsilon$ .  (3.30)

On machines without guard digits, the relations (3.29) and (3.30) may
fail to hold for addition and subtraction: we may only have the weaker
relation

$fl(x \pm y) = x(1 + \delta_1) \pm y(1 + \delta_2)$ ,
where  (3.31)
$|\delta_i| \le \varepsilon$ for $i = 1, 2$ .

With these machines it seems difficult to be sure that rounding errors
committed inside procedure glomin are harmless. At any rate, our
analysis depends heavily on relation (3.29). (See equation (3.52) and
the following analysis.)

We also suppose that square roots are computed with a small relative
error, say

$fl(\mathrm{sqrt}(x)) = \sqrt{x}\,(1 + \delta)$ ,
where  (3.32)
$|\delta| \le \varepsilon$ .

(Any good square root routine should satisfy (3.32) very easily. The
library routines for the IBM 360 certainly do: see Clark, Cody, Hillstrom
and Thieleker (1967).)

Let us first consider the effect of rounding errors in the computation
of $f$ , supposing for the moment that the internal computations of
procedure glomin are done exactly. The user has to provide procedure
glomin with a positive tolerance $e$ which gives a bound on the absolute
error in computing $f$ . More precisely, we assume that, for all $\delta$ and
$x$ with $|\delta| \le \varepsilon$ and $x$ , $x(1+\delta)$ in $[a,b]$ , we have

$|fl(f(x(1+\delta))) - f(x)| \le e$ ,  (3.33)

where $f(x)$ is the exact mathematical function (satisfying condition
(2.1)), and $fl(f(x))$ is its computed floating-point approximation. The
reason for condition (3.33) will be apparent later: at present we only
need the special case with $\delta = 0$ , i.e.,

$|fl(f(x)) - f(x)| \le e$  (3.34)

for all $x \in [a,b]$ .

We have seen that, without rounding errors, procedure glomin would
return $\hat\varphi$ (or y = glomin) and $\hat\mu$ (or x) satisfying

$\varphi_f \le \hat\varphi = f(\hat\mu) \le \varphi_f + t$ .  (3.35)

With rounding errors, (3.35) no longer holds, but we shall show that

$\varphi_f \le f(\hat\mu) \le \varphi_f + t + 2e$  (3.36)

and

$\varphi_f - e \le \hat\varphi = fl(f(\hat\mu)) \le \varphi_f + t + e$ .  (3.37)

If the error $e$ in computing $f$ is much less than the tolerance $t$ ,
then (3.36) and (3.37) are much the same as (3.35), so rounding errors
have little effect on the accuracy of $\hat\varphi$ .

The left hand inequality in (3.36) is obvious from the definition
of $\varphi_f$ . To prove the right hand inequality, we must look closely at
the "critical" sections of procedure glomin, i.e., the sections where
rounding errors could make an essential difference. (Examples of
non-critical sections are the random and non-random searches.)

In computing the safe choice $a_3''$ for $a_3$ according to equation
(3.9), we compute

$s = \left( \frac{y_2 - \hat\varphi + t}{m_2} \right)^{1/2}$  (3.38)

and

$r = -\tfrac{1}{2}\left( d_0 + \frac{z_0 + 2.01e}{d_0 m_2} \right)$ ,  (3.39)

where $d_0 = a_2 - a_1$ , $z_0 = y_2 - y_1$ , $m_2 = \tfrac{1}{2} M > 0$ ,
and $y_i = fl(f(a_i))$ for $i = 1, 2$ . Thus

$s \le \left( \frac{f(a_2) - f(\hat\mu) + (t + 2e)}{m_2} \right)^{1/2}$ ,  $\hat\varphi = fl(f(\hat\mu))$ ,  (3.40)

so, as far as the computation of $s$ is concerned, everything said
above holds if $t$ is replaced by $t + 2e$ . (Remember that we are
regarding all computations inside the procedure as exact.) We are only
interested in $r$ when $d_0 > 0$ and $r > 0$ , and as

$z_0 + 2.01e > z_0 + 2e \ge f(a_2) - f(a_1)$ ,

we have

$r \le -\tfrac{1}{2}\left( d_0 + \frac{f(a_2) - f(a_1)}{d_0 m_2} \right)$ .  (3.41)

(The reason for the extra $0.01e$ will be apparent later.) Thus, the
computed $a_3''$ will not exceed the correct value given by (3.9), if $t$
is replaced by $t + 2e$ .

The other point where rounding errors in the computation
of $f$ are critical is when we determine whether the parabola $y = P(x)$ ,
with $P''(x) = M$ , $P(a_2) = y_2$ , and $P(a_3) = y_3$ , lies above the line
$y = \hat\varphi - t$ in the interval $(a_2, a_3)$ . Let $y = Q(x)$ be the parabola
with $Q''(x) = M$ , $Q(a_2) = f(a_2)$ , and $Q(a_3) = f(a_3)$ . Since
$y_i = fl(f(a_i)) \le f(a_i) + e$ for $i = 2, 3$ ,
it is clear that

$P(x) \le Q(x) + e$  (3.42)

in $(a_2, a_3)$ . Thus, if

$P(x) \ge \hat\varphi - t$  (3.43)

in $(a_2, a_3)$ , then

$Q(x) \ge \hat\varphi - t - e \ge f(\hat\mu) - t - 2e$  (3.44)

in $(a_2, a_3)$ , so again everything is accounted for by changing $t$ to
$t + 2e$ . This completes the proof of (3.36). The left inequality in
(3.37) is obvious, and the right inequality follows from the above
argument if we note that it is sufficient to replace $t$ by $t + e + (f(\hat\mu) - \hat\varphi)$ .

Now, let us consider the effect of rounding errors committed inside
procedure glomin. We shall show that (3.36) and (3.37) still hold,
provided some minor modifications are made in the algorithm. These
modifications are included in procedure glomin, but, to avoid confusion,
they were not mentioned in the description above. The most important
modification is that, instead of having $m_2 = \tfrac{1}{2} M$ , procedure glomin has

$m_2 = fl(\tfrac{1}{2}(1 + 16\varepsilon) M)$ ,  (3.45)

where the factor $1 + 16\varepsilon$ is introduced purely to nullify the effect
of rounding errors.

For the sake of simplicity, terms of order $\varepsilon^2$ are ignored in the
rest of this section. Because of the slack in some of our inequalities,
these terms may be accounted for if $\varepsilon$ is sufficiently small. From (3.45) and the
assumption (3.29), we certainly have

$m_2 \ge \tfrac{1}{2}(1 + 13\varepsilon) M$ .  (3.46)


In the computation of $a_3''$ according to (3.9), procedure glomin
actually computes

$\tilde s = fl\left( \left( \frac{(y_2 - \hat\varphi) + t}{m_2} \right)^{1/2} \right)$ ,  (3.47)

and as errors in the computation of $f$ have already been accounted for,
we can assume that $y_2$ and $\hat\varphi$ are exact floating-point numbers. From
(3.46) and the assumptions (3.29) and (3.32),

$\tilde s \le (1 + \delta_4) \left( \frac{((y_2 - \hat\varphi)(1 + \delta_1) + t)(1 + \delta_2)(1 + \delta_3)}{\tfrac{1}{2} M (1 + 13\varepsilon)} \right)^{1/2}$ ,  (3.48)

where $|\delta_i| \le \varepsilon$ for $i = 1, \dots, 4$ . Since $y_2 - \hat\varphi$ and $t$ are both
nonnegative,

$(y_2 - \hat\varphi)(1 + \varepsilon) + t \le (y_2 - \hat\varphi + t)(1 + \varepsilon)$ ,  (3.49)

so

$\tilde s \le s$ .  (3.50)

Thus, the slight modification of $m_2$ has ensured that the computed $\tilde s$
is no greater than the exact $s$ . Note that, in the derivation of
(3.50), it was essential that $y_2 - \hat\varphi$ was computed with a small relative
error, so the assumption (3.29) was necessary: (3.31) would not be enough.
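The effect of inflating $m_2$ by the factor $1 + 16\varepsilon$ can be checked exactly with rational arithmetic. The sketch below assumes IEEE 754 binary64 arithmetic with a correctly rounded square root, so $\varepsilon = 2^{-53}$; the names `computed_s` and `never_exceeds_exact` and the sampling ranges are hypothetical. It compares the square of the computed step with the exact value of $(y_2 - \hat\varphi + t)/(\tfrac{1}{2} M)$.

```python
from fractions import Fraction
import math
import random

EPS = 2.0 ** -53


def computed_s(y2, phi_hat, t, M):
    """Floating-point step with the inflated m2 = fl((1/2)(1 + 16 eps) M)."""
    m2 = 0.5 * (1.0 + 16.0 * EPS) * M
    return math.sqrt(((y2 - phi_hat) + t) / m2)   # note the grouping (y2 - phi_hat) + t


def never_exceeds_exact(trials=2000, seed=0):
    """True if the computed step never exceeds the exact one on random data."""
    rng = random.Random(seed)
    for _ in range(trials):
        M = rng.uniform(0.1, 100.0)
        phi_hat = rng.uniform(-5.0, 5.0)
        y2 = phi_hat + rng.uniform(0.0, 10.0)
        t = rng.uniform(1e-6, 1.0)
        s_tilde = Fraction(computed_s(y2, phi_hat, t, M))
        exact_sq = (Fraction(y2) - Fraction(phi_hat) + Fraction(t)) / (Fraction(M) / 2)
        if s_tilde * s_tilde > exact_sq:          # compare squares to stay rational
            return False
    return True
```

Comparing squares avoids forming an irrational square root exactly; since both sides are nonnegative, the comparison is equivalent to comparing the steps themselves.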
Similarly, to find $a_3''$ we actually compute

$\tilde r = fl\left( -\tfrac{1}{2}\left[ (a_2 - a_1) + \frac{(y_2 - y_1) + 2.01e}{m_2 (a_2 - a_1)} \right] \right)$ ,  (3.51)

where $e > 0$ , $m_2 > 0$ , and $a_2 > a_1$ . We are only interested in $\tilde r$
if $\tilde r > 0$ , so

$0 > fl((y_2 - y_1) + 2.01e)$
$\ \ \ge ((y_2 - y_1)(1 + \varepsilon) + 2.01e(1 - \varepsilon))(1 + \varepsilon)$
$\ \ \ge (y_2 - y_1 + 2e)(1 + \varepsilon)^2$ ,  (3.52)

assuming that $\varepsilon$ is sufficiently small. (The reason for the extra $0.01e$ in (3.39) is
now clear.) Thus

$\tilde r = fl(-\tfrac{1}{2}(r_1 + r_2))$ ,  (3.53)

where

$0 < (a_2 - a_1)(1 - \varepsilon) \le r_1 \le (a_2 - a_1)(1 + \varepsilon)$  (3.54)

and

$0 > r_2 \ge \frac{(y_2 - y_1 + 2e)(1 - 9\varepsilon)}{\tfrac{1}{2} M (a_2 - a_1)}$ .  (3.55)

Since $\tilde r > 0$ , (3.53) shows that $|r_1| < |r_2|$ , so, from (3.53) to
(3.55),

$\tilde r \le r$ .  (3.56)

As before, the computed $\tilde r$ is no greater than the correct $r$ . The
same is not true for $\tilde a_3''$ , the computed value of $a_3''$ , but $\tilde a_3''$ is
either $b$ , $fl(a_2 + \tilde r)$ , or $fl(a_2 + \tilde s)$ . Suppose, for example, that

$\tilde a_3'' = fl(a_2 + \tilde s)$ .  (3.57)

Then

$fl(f(\tilde a_3'')) = fl(f((a_2 + \tilde s)(1 + \delta)))$  (3.58)

where $|\delta| \le \varepsilon$ , so, from (3.33),

$|fl(f(\tilde a_3'')) - f(a_2 + \tilde s)| \le e$ .  (3.59)

(This is why we required (3.33) instead of the weaker (3.34).) Thus,
the error in computing $a_2 + \tilde s$ or $a_2 + \tilde r$ can be ignored, for it has
been absorbed into the assumption (3.33) on $e$ .

Finally, we have to consider the effect of rounding errors when
testing the condition (3.28). First

$\tilde p = fl\left( \frac{y_2 - y_3}{\tfrac{1}{2} M \tilde d_0} \right)$  (3.60)

is computed. It is important to note that we use $\tfrac{1}{2} M$ , not the
slightly different $m_2$ (given by (3.45)) here. Thus

$\tilde p = \frac{y_2 - y_3}{\tfrac{1}{2} M (a_3 - a_2)} (1 + 5\delta_1)$ ,  (3.61)

and

$\tilde d_0 = fl(a_3 - a_2) = (a_3 - a_2)(1 + \delta_2)$ ,  (3.62)

where $|\delta_i| \le \varepsilon$ for $i = 1, 2$ .

The test actually made by procedure glomin is whether

$|\tilde p| < fl((1 + 9\varepsilon)\tilde d_0) \ \wedge\ fl(\tfrac{1}{2} m_2 (\tilde d_0^2 + \tilde p^2)) > fl[(y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t]$ ,  (3.63)

and we shall show that (3.63) is true whenever the condition (3.28) is
true. First, $|p| < d_0$ implies that $|\tilde p| < d_0 (1 + 5\varepsilon)$ , and thus

$|\tilde p| < fl((1 + 9\varepsilon)\tilde d_0)$ .  (3.64)

Similarly, if $|p| < d_0$ and

$\tfrac{1}{4} M (d_0^2 + p^2) > (y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t$ ,  (3.65)

then

$\tilde d_0^2 + \tilde p^2 \ge (d_0^2 + p^2)(1 - 6\varepsilon)$ ,  (3.66)

so

$fl(\tfrac{1}{2} m_2 (\tilde d_0^2 + \tilde p^2)) \ge \tfrac{1}{4} M (d_0^2 + p^2)(1 + 4\varepsilon)$
$\ \ > ((y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t)(1 + 3\varepsilon)$
$\ \ \ge fl((y_2 - \hat\varphi) + (y_3 - \hat\varphi) + 2t)$ .  (3.67)

(Note the importance of grouping the terms: since $y_2 - \hat\varphi$ , $y_3 - \hat\varphi$ and
$2t$ are all nonnegative, their sum can be computed with a small relative
error.)

From (3.64) and (3.67), the inexact test (3.63) results in $a_3$ being
reduced whenever the exact test (3.28) says that it must be. $a_3$ may
occasionally be reduced unnecessarily because of rounding errors, but
this does not invalidate the bounds (3.36) and (3.37): it merely causes
some unnecessary function evaluations.

We should mention a remote possibility that rounding errors can
prevent convergence. This is only possible if $fl(a_2 + \tilde s) = a_2$ , and,
as $\tilde s \ge (1 - 14\varepsilon)(2t/M)^{1/2}$ , there is no chance of it happening provided

$t > M \varepsilon^2 \max(a^2, b^2)$ .  (3.68)

Thus, convergence can only be prevented by rounding errors if $t$ is
unreasonably small.

In conclusion, procedure glomin is guaranteed to return $\hat\varphi$ and $\hat\mu$
satisfying the bounds (3.36) and (3.37), provided the input parameters
macheps, t and e are set correctly.


4.  The  rate  of  convergence  in  some  special  cases 

It  is  difficult  to  say  much  in  general  about  the  number  of  function 
evaluations required by the algorithm described in Section 3. In the
next  section  we  compare  the  algorithm  with  the  best  possible  one  for 
given  M  and  t  .  In  this  section,  we  try  to  gain  some  insight  into  the 
dependence  of  the  number  of  function  evaluations  on  the  bound  M  and 
the  tolerance  t  ,  by  looking  at  some  simple  special  cases. 

The worst case

As pointed out above (equation (3.4)), two function evaluations
are enough to determine $\hat\mu$ and $\hat\varphi$ if $M \le 0$ , so suppose that $M > 0$ ,
and let

$\delta = (2t/M)^{1/2}$ .  (4.1)

We showed above that, if the last function evaluation was at $a_2 \in [a,b)$ ,
we could safely choose

$a_3 = \min(b, a_2 + \delta)$  (4.2)

for the next evaluation (step 2 of the basic algorithm). With this
simple choice of $a_3$ , about $(b-a)/\delta$ function evaluations would be
required. Procedure glomin tries to do better than this, and is nearly
always successful (see Section 6), but the worst that can happen is
that $a_3$ will be chosen to be $b$ , and then $a_3$ will be reduced several
times at step 4 of the basic algorithm. As $a_3 - a_2$ is halved at each
such reduction of $a_3$ , there can be at most

$\left\lceil \log_2\left( \frac{b-a}{\delta} \right) \right\rceil$  (4.3)

consecutive reductions of $a_3$ at step 4. Thus, at worst, about

$\frac{b-a}{\delta} \log_2\left( \frac{b-a}{\delta} \right)$  (4.4)

function  evaluations  will  be  required.  We  have  ignored  the  random  and 
•nonrandom  searches,  but  these  can  only  add  about  2(— — )  extra  function 
evaluations . 


If $\delta$ is given by (4.1), the term $\log_2((b-a)/\delta)$ in (4.4) varies
only slowly with $M$ and $t$ , so the upper bound is roughly proportional
to $(b-a)(M/t)^{1/2}$ . In particular, the upper bound is roughly proportional
to $\sqrt{M}$ , and it seems to be a good general rule that the number of function
evaluations is roughly proportional to $\sqrt{M}$ , even when the upper bound
(4.4) is not attained (see below and Section 6).

A straight line

If the global minimum of $f$ occurs at an endpoint $\mu = a$ or $b$ ,
and $f'(\mu) \ne 0$ , we can gain an insight into the behaviour of the
algorithm near $\mu$ by considering the linear approximation $f(\mu) + (x - \mu) f'(\mu)$
to $f(x)$ . Suppose, for example, that

$f(x) = k(x - a) + t$  (4.5)

for some $k > 0$ , so $\mu = a$ . Ignoring the random searches, the
algorithm will evaluate $f$ at the points $a$ , $b$ , $c$ , and then at
points $x_1 < x_2 < x_3 < \dots < x_{N-1}$ , say, where $x_0 = a < x_1$ and $x_{N-1} < b$ ,
and the points $(x_n, f(x_n))$ and $(x_{n+1}, f(x_{n+1}))$ lie on the parabola
$y = P_n(x)$ which touches the line $y = 0$ and has $P_n''(x) = M$ . (See
Diagram 4.1.) If $P_n(x)$ touches $y = 0$ at $x = a_n$ , then

$P_n(x) = \tfrac{1}{2} M (x - a_n)^2$ ,  (4.6)

so

$a_n = x_n + \left( \frac{2}{M} f(x_n) \right)^{1/2} = x_{n+1} - \left( \frac{2}{M} (k(x_{n+1} - a) + t) \right)^{1/2}$ .  (4.7)

If

$z_n = (x_n - a + t/k)^{1/2}$ ,  (4.8)

then (4.7) gives

$z_{n+1} - z_n = (2k/M)^{1/2}$ .  (4.9)

Thus

$z_n = (t/k)^{1/2} + n (2k/M)^{1/2}$ ,  (4.10)

$x_n = a - t/k + \left( (t/k)^{1/2} + n (2k/M)^{1/2} \right)^2$ ,  (4.11)

and the number of steps needed to reach $b$ is approximately

$N \simeq (M/(2k))^{1/2} \left( (b - a + t/k)^{1/2} - (t/k)^{1/2} \right)$ .  (4.12)

Diagram 4.1: A line, $f(x) = k(x - a) + t$ (for $N = 6$ )

Two limiting cases of (4.12) are interesting. If $t$ is small and
$k$ not too small, so that $k(b - a) \gg t$ , then

$N \simeq \left( \frac{M(b - a)}{2k} \right)^{1/2}$ ,  (4.13)

which is independent of $t$ . (In this section we are neglecting the
effect of rounding errors, but these should not be important if $t$
satisfies the weak condition (3.68).)

If $k$ is very small, so that $k(b - a) \ll t$ , then (4.12) gives

$N \simeq (b - a)/(2\delta)$ ,  (4.14)

and the algorithm proceeds in steps of size about $2\delta$ , where $\delta$ is
given by (4.1).
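The behaviour on a straight line can be reproduced by iterating the tangency construction directly. The Python sketch below assumes $f(x) = k(x - a) + t$ with $k > 0$ and $\hat\varphi = f(a) = t$, so that each bounding parabola touches $y = 0$; from $x_n$ it finds the tangency point $a_n = x_n + (2 f(x_n)/M)^{1/2}$ and then the point $x_{n+1} > a_n$ where the same parabola meets the line again. The function name `line_steps` is hypothetical. The counts are compared with the two limiting estimates $(M(b-a)/(2k))^{1/2}$ for $k(b-a) \gg t$ and $(b-a)/(2\delta)$ with $\delta = (2t/M)^{1/2}$ for $k(b-a) \ll t$.

```python
import math


def line_steps(a, b, k, t, M):
    """Number of points x_1, x_2, ... generated until one of them reaches b,
    for f(x) = k (x - a) + t and bounding parabolas touching y = 0."""
    x, n = a, 0
    while x < b:
        a_n = x + math.sqrt(2.0 * (k * (x - a) + t) / M)     # tangency point
        # x_{n+1} is the root to the right of a_n of
        #   (x - a_n)^2 = (2 / M) (k (x - a) + t)
        disc = 2.0 * (k * (a_n - a) + t) / M + (k / M) ** 2
        x = a_n + k / M + math.sqrt(disc)
        n += 1
    return n


steep = line_steps(0.0, 1.0, 1.0, 1e-8, 100.0)      # k (b - a) >> t
flat = line_steps(0.0, 1.0, 1e-6, 1e-2, 100.0)      # k (b - a) << t
```

The first count does not depend on $t$ to leading order, while the second corresponds to steps of length about $2\delta$.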


A  parabola 

If the global minimum of $f$ occurs at an interior point $\mu$ , then
$f'(\mu) = 0$ , so if $f''(\mu) \ne 0$ we may analyse the behaviour of the
algorithm near $\mu$ by considering the parabolic approximation
$f(\mu) + \tfrac{1}{2} f''(\mu)(x - \mu)^2$ to $f(x)$ . Thus, suppose that

$M > m > 0$  (4.15)

and

$f(x) = \tfrac{1}{2} m (x - \mu)^2 + t$ ,  (4.16)

where $\mu \in (a,b)$ . The nonrandom search will quickly locate $\mu$ , so we
may suppose that $\hat\mu = \mu$ , and, without loss of generality, $\mu = 0$ . The
algorithm will call for the evaluation of $f$ at points to the left, and
then to the right, of $\mu$ . As these two cases are similar, let us
define $x_0 = \mu = 0$ , and study the points $x_1, x_2, \dots$ defined above,
except that now $f$ is given by (4.16) instead of by (4.5). In place
of (4.7), we find that

$a_n = x_n + \left( \frac{m}{M} \left( x_n^2 + \frac{2t}{m} \right) \right)^{1/2} = x_{n+1} - \left( \frac{m}{M} \left( x_{n+1}^2 + \frac{2t}{m} \right) \right)^{1/2}$ .  (4.17)

It does not seem to be possible to give a simple expression like
(4.11) for $x_n$ , defined by the recurrence relation (4.17), but we may
solve for $x_{n+1}$ in terms of $x_n$ , obtaining

$x_{n+1} = \frac{M+m}{M-m} x_n + \frac{2M}{M-m} \left( \frac{m}{M} \left( x_n^2 + \frac{2t}{m} \right) \right)^{1/2}$ .  (4.18)

If

$\rho = (M/m)^{1/2}$ ,  (4.19)

this may be written as

$x_{n+1} = \frac{\rho+1}{\rho-1} x_n + \frac{2\rho}{\rho^2 - 1} \left[ \left( x_n^2 + \frac{2t}{m} \right)^{1/2} - x_n \right]$ .  (4.20)

Suppose that $\rho$ is close to 1, i.e., $M$ is not much larger
than $m = f''(\mu)$ . Then

$x_1 = \frac{2\rho}{\rho^2 - 1} \left( \frac{2t}{m} \right)^{1/2} \simeq \frac{1}{\rho - 1} \left( \frac{2t}{m} \right)^{1/2}$ .  (4.21)

For $n \ge 1$ , the first term in (4.20) dominates the second, and

$x_{n+1} = \frac{\rho+1}{\rho-1} x_n (1 + O((\rho - 1)^2))$ as $\rho \to 1$ .  (4.22)

Thus, if $\rho$ is close to 1, then

$x_{n+1} \simeq \frac{2}{\rho - 1} x_n$  (4.23)

for $n \ge 1$ , and, as the factor $2/(\rho - 1)$ is large, only a few function
evaluations will be required.
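The rapid growth of the step near an interior minimum can be seen by iterating the recurrence numerically. The sketch below assumes $f(x) = \tfrac{1}{2} m x^2 + t$ with $\mu = 0$ and $M > m > 0$, and solves the tangency relation for $x_{n+1}$ in the form $x_{n+1} = [(\rho^2 + 1) x_n + 2\rho (x_n^2 + 2t/m)^{1/2}]/(\rho^2 - 1)$ with $\rho^2 = M/m$; the helper names and the numerical values are hypothetical.

```python
import math


def touch_offset(x, m, M, t):
    """Distance from x to the point where the bounding parabola through
    (x, f(x)) touches y = 0, for f(x) = m x^2 / 2 + t."""
    return math.sqrt((m / M) * (x * x + 2.0 * t / m))


def next_point(x, m, M, t):
    """x_{n+1} obtained from x_n by solving the tangency relation."""
    rho2 = M / m
    root = math.sqrt(x * x + 2.0 * t / m)
    return ((rho2 + 1.0) * x + 2.0 * math.sqrt(rho2) * root) / (rho2 - 1.0)


m_ex, M_ex, t_ex = 1.0, 1.21, 1e-6          # rho = 1.1
points = [0.0]
for _ in range(4):
    points.append(next_point(points[-1], m_ex, M_ex, t_ex))
ratios = [points[n + 1] / points[n] for n in range(1, 4)]
```

With $\rho = 1.1$ the successive ratios settle near $(\rho + 1)/(\rho - 1) = 21$, so a handful of evaluations carries the search well away from the minimum.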


5.  A lower bound on the number of function evaluations required

Suppose that a positive tolerance $t$ and bound $M$ are given,
that $f$ attains its global minimum $\varphi_f$ in $[a,b]$ at $\mu_f$ , and that

$f''(x) < M$  (5.1)

for all $x \in [a,b]$ . (Similar results to those below hold if equality is
allowed, but the definitions and proofs have to be modified slightly.)
First, we need a lemma.

Lemma 5.1

If $x' \in [a,b)$ , then there is at most one point $x'' \in (x', b]$ , such that
the parabola $y = P(x)$ , with $P''(x) = M$ , $P(x') = f(x')$ , and touching
the line $y = \varphi_f - t$ , satisfies $P(x'') = f(x'')$ .

Proof

Suppose, by way of contradiction, that two such distinct points $x''$
and $x'''$ exist. Then

$M = 2 f[x', x'', x'''] = f''(\xi)$  (5.2)

for some $\xi \in [x', x''']$ (see Chapter 2), contradicting

$f''(\xi) < M$ .  (5.3)


Definition  5.1 

For $x' \in [a,b)$ , define


«<*•)  = 


$x''$ if the point $x''$ of Lemma 5.1 exists,

$b$ otherwise.

Lemma 5.2 shows that $N$ is finite, in fact

$N \le 1 + \lceil (b-a)(M/(8t))^{1/2} \rceil$ .  (5.5)

The following lemma shows that, in order to prove that $f(x) \ge \varphi_f - t$
for all $x \in [a,b]$ , given only condition (5.1), it is sufficient to
evaluate $f$ at $x_1, x_2, \dots, x_N$ .

Lemma 5.3

If $g \in C^2[a,b]$ , $g''(x) \le M$ for all $x \in [a,b]$ , and

$g(x_n) = f(x_n)$  (5.6)

for $n = 1, 2, \dots, N$ and the points $x_n$ defined above, then

$\varphi_g \ge \varphi_f - t$ .  (5.7)

Proof

The lemma follows immediately from the definitions and Theorem 2.1.
(Clearly, weaker conditions on $g$ , e.g. condition (2.1), are sufficient.)

Our interest in the points $x_1, \dots, x_N$ stems from the following
theorem, which complements Lemma 5.3.

Theorem 5.1

Let $x_1' < x_2' < \dots < x_\nu'$ be any $\nu$ points in $[a,b]$ , with $\nu < N$ .
Then there is a function $g \in C^2[a,b]$ , satisfying

$g''(x) < M$  (5.8)

for all $x \in [a,b]$ , and

$g(x_n') = f(x_n')$  (5.9)

for $n = 1, 2, \dots, \nu$ , such that

$\varphi_g < \varphi_f - t$ .  (5.10)

Proof

Suppose, by way of contradiction, that

$\varphi_g \ge \varphi_f - t$  (5.11)

for all such $g$ . Then $x_1' = a$ , for otherwise $-g(a)$ can be
arbitrarily large, and, similarly, $x_\nu' = b$ . Since $\nu < N$ , there is
an $n$ , $1 \le n < \nu$ , such that $x_n' \le x_n$ and $x_{n+1}' > x_{n+1}$ . Thus,
the parabola $y = P(x)$ , with $P''(x) = M$ , $P(x_n') = f(x_n')$ , and
$P(x_{n+1}') = f(x_{n+1}')$ , is such that

$\min_{x \in [x_n', x_{n+1}']} P(x) < \varphi_f - t$ .  (5.12)

Since there is a function $g$ as above which is arbitrarily close to
$P(x)$ in $[x_n', x_{n+1}']$ , this contradicts (5.11), so the theorem holds.

Consequences  of  the  theorem 

Theorem 5.1 says that, if all that is known a priori about f is
that f ∈ C²[a,b] and satisfies condition (5.1), then any algorithm
which is guaranteed to find μ̂ so that f(μ̂) ≤ φ_f + t must require
at least N evaluations of f. This is so because, if an algorithm
required only ν < N evaluations at points x'_1 < x'_2 < ... < x'_ν, say,
then it would be sure to fail for either f or for g, for f and g
are indistinguishable on the basis of the ν function evaluations,
yet φ_g + t < φ_f. Of course, we are only considering algorithms which
sequentially evaluate f at a finite number of points.

Conversely, Lemma 5.3 implies that N+1 function evaluations are
sufficient (just evaluate f at μ_f and x_1, ..., x_N), and possibly N
are sufficient. (See Diagram 5.1.) Unfortunately, Lemma 5.3 does not
give us an effective algorithm for approximating φ_f, for we do not
know N or the points x_1, ..., x_N in advance, and a large number of
function evaluations is usually needed to approximate them.

Efficiency 

Suppose that an algorithm requires N' function evaluations to
find φ̂ = f(μ̂) such that φ̂ ≤ φ_f + t is guaranteed. We could define
the efficiency E of the algorithm by

E = N/N' .   (5.13)

(Note that E depends on f, M, t, a and b, as well as on the
algorithm.) We have shown that

E ≤ 1   (5.14)

for any correct (i.e., guaranteed) algorithm, so, if an algorithm has
an efficiency close to 1, then we are justified in saying that the
algorithm is nearly optimal (for that f, M, t etc.). In the next
section we give numerical results which show that, for practical examples,
the algorithm described in Section 3 is often nearly optimal.
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6.  Practical  tests 

The ALGOL procedure glomin given in Section 10 was tested using
ALGOL W (Wirth and Hoare (1966), Bauer, Becker and Graham (1968)) on an
IBM 360/91 computer with machine precision 16^−13. Some representative
numerical results are summarized in Table 6.1. For all of these
results the parameters e and macheps were set at 10^−14 and 16^−13
respectively.

The table gives the upper bound M (parameter m of glomin) on f'',
and the total number of function evaluations required by procedure glomin:
N'' with tolerance t = 10^−8, and N' with tolerance t = 10^−12. The
lower bound N defined in Section 5 is also given for t = 10^−12.
(Recall that no algorithm which is guaranteed to succeed can take less
than N function evaluations.) N and the points x_1, ..., x_N (see
Section 5) were computed in the obvious way from Definition 5.2, using
procedure zero of Chapter 4 to solve the nonlinear equation

P(x) = f(x) ,   (6.1)

where P(x) is the parabola of Lemma 5.1. Finally, the efficiency
E = N/N' (equation (5.13)) is given.

For  some  more  numerical  results,  see  Section  8. 
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Table  6.1:  Numerical  results  for  procedure  glomin 


The symbols are explained above. The functions are:

f_1(x) = 2 − x on [7,9] (in all cases μ̂ = 9, φ̂ = −7),

f_2(x) = x² on [−1,2] (in all cases μ̂ = φ̂ = 0),

f_3(x) = x² + x³ on [−½,2] (for t = 10^−12, |μ̂| < 5·10^−10, |φ̂| < 6·10^−20),

f_4(x) = (x + sin(x))exp(−x²) on [−10,10] (μ̂ = −0.6795786599525,
φ̂ = −0.824239398476077), and

f_5(x) = (x − sin(x))exp(−x²) on [−10,10]
(μ̂ = −1.195136641665, φ̂ = −0.0634905239364399).

Comments  on  Table  6.1 


The results for the simple functions f_1(x) = 2 − x and f_2(x) = x²
verify the predictions made in Section 4. For example, the values N = 11
and N = 101 for f_1 are exactly as predicted: one more than the
right side of equation (4.12). N, N' and N'' are roughly proportional
to √M if M ≫ f''(μ) (see also the results for f_4), but this rule
breaks down if M ≈ f''(μ), as expected from equation (4.23). (See the
results for f_2 with M = 2, 2.1, 2.2.)

It appears that the number of function evaluations does not depend
strongly on t: comparing N'' with N', we see that the average
number of function evaluations required is only about 20 percent more
for t = 10^−12 than for t = 10^−8.

Finally, the efficiency E of the algorithm is fairly high, even
for the difficult functions f_4 and f_5. This means that no correct
algorithm based entirely on function evaluations could do very much better
than ours, at least on these examples. This is not too surprising, in
view of the results of Section 5.


7.  Some  extensions  and  generalizations 

So far we have assumed that f ∈ C²[a,b] and

f''(x) ≤ M   (7.1)

for all x ∈ [a,b], or at least that f ∈ C¹[a,b] and

f'(x) − f'(y) ≤ M(x − y)   (7.2)

for a ≤ y < x ≤ b. Condition (7.2) was necessary to prove the basic
Theorem 2.1. For the application discussed in Section 8 (global
minimization of a function of several variables), we need to find the
global minimum of a function which is continuous, but not necessarily
differentiable. We can justify using procedure glomin, even though f
may not be differentiable, because of the following Theorems 7.1 to 7.3,
which generalize Theorems 2.1 to 2.3. (If the reader is prepared to
accept the fact that Theorems 2.1 to 2.3 can be generalized in the
appropriate way, he may skip this section.)

Theorem 7.1

Let f ∈ C[a,b], and suppose that there is a constant M such
that, for all sufficiently small h > 0,

f(u+h) − 2f(u) + f(u−h) ≤ Mh²   (7.3)

for all u ∈ [a+h, b−h]. Then, for all x ∈ [a,b],

f(x) ≥ [(b−x)f(a) + (x−a)f(b)]/(b−a) − ½M(x−a)(b−x) .   (7.4)


Proof

There is no loss of generality in assuming that

f(a) = f(b) = 0   (7.5)

and

M = 0 ,   (7.6)

for we can consider f(x) − P(x), where P(x) is the right side of
(7.4), instead of f(x). Thus, we have to show that

φ_f ≥ 0 ,   (7.7)

where φ_f is the least value of f on [a,b]. Suppose, by way of
contradiction, that

φ_f < 0 ,   (7.8)

and let

u = sup{x ∈ [a,b] | f(x) = φ_f} .   (7.9)

By the continuity of f, f(u) = φ_f < 0, so u ≠ a or b. Thus,
for sufficiently small h > 0, u ∈ [a+h, b−h], and, from the
definition of u,

f(u−h) ≥ f(u)   (7.10)

and

f(u+h) > f(u) .   (7.11)

Because of the assumption (7.6), this contradicts (7.3), so (7.8) is
impossible, and the result follows. (Note the close connection with
the maximum principle for elliptic difference operators.)
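
A numerical spot check can make the bound (7.4) more tangible. The sketch below is only an illustration; the helper names are hypothetical. It evaluates the right side of (7.4) and compares it with f at sample points. For f(x) = x² with M = 2 the bound is attained with equality; for f(x) = −|x|, which is not differentiable at 0 but has non-positive second differences, M = 0 suffices; with M chosen too small the bound fails.

```python
def parabolic_lower_bound(fa, fb, a, b, M, x):
    """Right side of (7.4): chord through (a, fa), (b, fb) minus
    the parabola (M/2)(x - a)(b - x)."""
    chord = ((b - x) * fa + (x - a) * fb) / (b - a)
    return chord - 0.5 * M * (x - a) * (b - x)

def bound_holds(f, a, b, M, samples=200, slack=1e-12):
    """Check f(x) >= bound(x) at equally spaced sample points.
    The slack absorbs rounding error when the bound is tight."""
    fa, fb = f(a), f(b)
    for i in range(samples + 1):
        x = a + (b - a) * i / samples
        if f(x) < parabolic_lower_bound(fa, fb, a, b, M, x) - slack:
            return False
    return True

if __name__ == "__main__":
    print(bound_holds(lambda x: x * x, -1.0, 2.0, 2.0))    # M = f''
    print(bound_holds(lambda x: -abs(x), -1.0, 1.0, 0.0))  # not differentiable
    print(bound_holds(lambda x: x * x, -1.0, 2.0, 1.0))    # M too small
```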


Theorem 7.2

Suppose that (7.3) holds, M > 0, a < c_1 < c_2 < b, and
f(a) > f(c_1) = f(c_2). Then

c_2 > a + [(f(a) − f(c_1))/(½M)]^(1/2) .   (7.12)

Proof

Apply Theorem 7.1 with x replaced by c_1 and b by c_2. The
hypothesis that f(c_1) = f(c_2) gives, after some simplification,

(c_1 − a)(c_2 − a) ≥ (f(a) − f(c_1))/(½M) ,   (7.13)

and the result follows as c_2 − a > c_1 − a > 0.
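
The simplification in this proof can be written out. The following is a sketch of the algebra, using only the inequality of Theorem 7.1 with x replaced by c_1 and b by c_2, together with the hypothesis f(c_1) = f(c_2):

```latex
\begin{align*}
f(c_1) &\ge \frac{(c_2-c_1)\,f(a) + (c_1-a)\,f(c_2)}{c_2-a}
          - \tfrac12 M (c_1-a)(c_2-c_1) \\
\intertext{Put $f(c_2)=f(c_1)$, multiply by $c_2-a>0$, and move
$(c_1-a)\,f(c_1)$ to the left:}
(c_2-c_1)\,f(c_1) &\ge (c_2-c_1)\,f(a)
          - \tfrac12 M (c_1-a)(c_2-c_1)(c_2-a) \\
\intertext{Divide by $c_2-c_1>0$ and rearrange, using $M>0$:}
(c_1-a)(c_2-a) &\ge \frac{f(a)-f(c_1)}{\tfrac12 M} .
\end{align*}
```

Since c_2 − a > c_1 − a > 0, the left side is less than (c_2 − a)², which bounds c_2 − a from below.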

Theorem 7.3

Suppose  that  (7-3)  holds,  M  >  0  ,  a  <  c  <  b  ,  and  the  interval 
I  =  [c,b]  fl  [c  ,  has  positive  length.  Then  f(x) 

is  strictly  monotonic  decreasing  on  I  . 


Proof

Suppose x_1, x_2 ∈ I with x_1 < x_2. We have to show that

f(x_1) > f(x_2) .   (7.14)

Apply Theorem 7.1, first with x replaced by c and b by x_1,
then with a replaced by c, x by x_1 and b by x_2. The two
resulting inequalities give, after some simplification,


(7*15) 


A.  *  Ar 

Since  _L__i  <  ,  the  right  side  of  (7-15)  is  positive,  so  (7.1*0 

holds . 


Remarks 

Theorems 7.1 to 7.3 generalize Theorems 2.1 to 2.3 respectively.
Since the algorithm described in Section 3 is based entirely on
Theorems 2.1 to 2.3, it is clear that condition (7.3) is sufficient for
the algorithm to find a correct approximation to the global minimum
of f. This is not surprising, for condition (7.3) is equivalent to
(7.2) if f ∈ C¹[a,b], and is equivalent to (7.1) if f ∈ C²[a,b]. In the
next section, we use this result to develop an algorithm for finding the
global minimum of a function f of several variables. The conditions
on f are much weaker than those required by Newman (1965), Sugie (1964),
or Krolak and Cooper (1963). (See also Kaupe (1964) and Kiefer (1957).)


8.  An  algorithm  for  global  minimization  of  a  function  of  several  variables 

Suppose that D = [a_x, b_x] × [a_y, b_y] is a rectangle in R²,
f: D → R has continuous second derivatives on D, and constants M_x
and M_y are known such that

f_xx(x,y) ≤ M_x   (8.1)

and

f_yy(x,y) ≤ M_y ,   (8.2)

for all (x,y) ∈ D. Let us define φ: [a_y, b_y] → R by

φ(y) = min { f(x,y) : x ∈ [a_x, b_x] } .   (8.3)

Clearly φ(y) is continuous, and

min { f(x,y) : (x,y) ∈ D } = min { φ(y) : y ∈ [a_y, b_y] } .   (8.4)


Thus, we have reduced the minimization of f(x,y), a function of two
variables, to the minimization of functions of one variable. Procedure
glomin (see Sections 3 and 10) can be used to evaluate φ(y) for a
given y, using condition (8.1). If we could show that

φ''(y) ≤ M_y ,   (8.5)

then procedure glomin could be used again (recursively) to minimize
φ(y), and thus, from (8.4), f(x,y). Unfortunately, examples show
that φ(y) need not be differentiable everywhere in [a_y, b_y], so
(8.5) may be meaningless (we shall see below that (8.5) holds when
φ''(y) exists). For example, consider

f(x,y) = xy   (8.6)

on D = [−1,1] × [−1,1]. Then

φ(y) = min(y, −y) = −|y| ,   (8.7)

which is not differentiable at y = 0, and we can not expect to prove
(8.5). The same problem may arise if the minimum in (8.3) occurs at an
interior point of D: one example is

f(x,y) = (x³ − 3x) sin(y)   (8.8)

on D = [−√3, √3] × [−10,10]. (f_x(x,y) vanishes for x = ±1,
so φ(y) = −2|sin(y)|, which is not differentiable at 0, ±π, etc.)
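
Example (8.6) is easy to check numerically. The sketch below is illustrative only: it approximates φ(y) by sampling x on a grid as a stand-in for procedure glomin, and the helper names are hypothetical. It then evaluates the second difference of φ, the left side of condition (7.3) of Section 7. For f(x,y) = xy we have f_yy = 0, and the second differences of φ(y) = −|y| are indeed never positive, even though φ has a corner at y = 0.

```python
def phi(f, y, ax, bx, n=200):
    """Approximate phi(y) = min over x in [ax, bx] of f(x, y) by
    sampling n + 1 equally spaced x values (a stand-in for glomin)."""
    return min(f(ax + (bx - ax) * i / n, y) for i in range(n + 1))

def second_difference(g, u, h):
    """g(u + h) - 2 g(u) + g(u - h), the left side of (7.3)."""
    return g(u + h) - 2.0 * g(u) + g(u - h)

def f(x, y):
    """Example (8.6)."""
    return x * y

def g(y):
    """phi for example (8.6) on the square [-1, 1] x [-1, 1]."""
    return phi(f, y, -1.0, 1.0)

if __name__ == "__main__":
    for u in (-0.5, 0.0, 0.25):
        print(u, g(u), second_difference(g, u, 0.125))
```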

Fortunately, the following theorem shows that φ(y) does satisfy
a condition like (7.3), so the results of Section 7 show that procedure
glomin can be used to find the global minimum of φ(y), just as if (8.5)
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Functions  of  n  variables 

Theorem 8.2 generalizes Theorem 8.1 to functions of any finite
number of variables.

Theorem 8.2

Suppose that n ≥ 1, I_i is a nonempty compact set in R for
i = 1, ..., n+1, D = I_1 × I_2 × ... × I_{n+1} ⊂ R^{n+1}, f: D → R is continuous,
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Proof 

The  proof  is  a  straight-forward  generalization  of  the  proof  of 
Theorem  8.1,  so  the  details  are  omitted. 

Theorem 8.2 shows that it is possible to use procedure glomin to
find the global minimum of a function f(x_1, ..., x_n) of any finite
number n > 1 of variables, provided upper bounds are known for the
partial derivatives f_{x_i x_i}(x) (i = 1, ..., n). It is interesting that
no bounds on the cross derivatives f_{x_i x_j}(x) (i ≠ j) are necessary.

If a one-dimensional minimization using procedure glomin requires
about K function evaluations, then we would expect that about K^n
function evaluations would be required for an n-dimensional minimization.
Since K is likely to be in the range 10 < K < 100 in practice (see
Section 6), the computation involved is likely to be excessive for
n > 3. Thus, for functions of more than three variables, we probably
must be satisfied with methods which find local, but not necessarily
global, minima (see Chapter 7). It should be noted, however, that the
theorems of Section 5 do not extend to functions of more than one
variable, so we do not know how far our procedure is from the best
possible (given only upper bounds on f_{x_i x_i} for i = 1, ..., n). Thus,
there is a chance that a much better method for finding the global
minimum of a function of several variables exists. It is also possible
that slightly stronger a priori conditions on f (e.g., both upper
and lower bounds on certain derivatives) might enable us to find the
global minimum much more efficiently.
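
The growth of the estimate K^n is worth seeing in numbers. The following lines are a rough cost model only, taken directly from the estimate in the text; the function name is hypothetical.

```python
def nested_evaluations(K, n):
    """Rough cost model from the text: about K**n evaluations of f
    for an n-dimensional problem when each 1-D search costs about K."""
    return K ** n

if __name__ == "__main__":
    # Tabulate the estimate for the two ends of the quoted range of K.
    for n in range(1, 5):
        print(n, nested_evaluations(10, n), nested_evaluations(100, n))
```

With K near 100, three variables already call for about a million function evaluations under this model.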

Minimization  of  a  function  of  two  variables:  procedure  glomin2d 

In Section 10 we give an ALGOL 60 procedure (glomin2d) for finding
the global minimum of a function f(x,y) of two variables, using the
method suggested above. Note that glomin2d uses procedure glomin in a
recursive manner, for glomin is required both to evaluate and to
minimize φ. The error bounds given in the initial comment of procedure
glomin2d are easily derived from the error bounds (3.36) and (3.37) for
procedure glomin.

Procedure glomin2d was tested on an IBM 360/91 computer (using
ALGOL W), and some numerical results are summarized in Table 8.1. In
all cases shown in the table the parameters macheps, e and t were
set at 16^−13, 10^−14 and 10^−10 respectively. (Thus φ_f − 10^−14 ≤ φ̂
≤ φ_f + 1.0002 × 10^−10 is guaranteed, where φ_f is the true minimum of f,
and φ̂ is the value returned by the procedure.) In the table we give
the upper bounds M_x and M_y (see equations (8.1) and (8.2)), the total
number of function evaluations N, and the approximate global minimum φ̂
(always very close to the true global minimum φ_f).

Table 8.1: Numerical results for procedure glomin2d

The symbols are explained above. The functions are:

f_1(x,y) = 133 + 99x − 35y on [−1,1] × ;

f_2(x,y) = x² + xy + 2y² on [−1,3] × [−2,4];

f_3(x,y) = 100(y − x²)² + (1 − x)² on [−1.2,1.2] × [−1.2,1.2];

f_4(x,y) = f_3(y,x) on the same domain;

f_5(x,y) = sin(x)cos(y)exp(−(x² + y²)) on [−1,2] × [−1,2];

f_6(x,y) = f_5(x,y) on [−2,4] × [−2,4].
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Comments  on  Table  8.1 

The results for the simple functions f_1 and f_2 are not very
surprising. As expected from the behaviour of procedure glomin on
functions of one variable (see Sections 5 and 6), the number of function
evaluations (N) increases with M_x and M_y.

f_3(x,y) = 100(y − x²)² + (1 − x)² is the well-known Rosenbrock
function (Rosenbrock (1960)), and it has a steep curved valley along
the parabola y = x². f_4(x,y) = f_3(y,x) is just the Rosenbrock function

in  disguise,  and  it  is  interesting  that  only  1815  function  evaluations 
were  required  to  minimize  f^  ,  compared  to  133^0  for  f^  .  Thus,  it  can 
make a large difference whether we minimize first over x (with y fixed)
and then over y, or vice versa, but it is difficult to give a reliable
rule as to which should be done first. Of course, even the lower figure
of 1815 function evaluations is very high by comparison with 100 or less
for methods which seek local minima (see Chapter 7), but perhaps this is
the price which must be paid to guarantee that we do have the global
minimum. (This is only a conjecture, for the results of Section 5 have
not been extended to functions of several variables.)

The functions f_5 and f_6 are the same, but the domain of f_6 is
four times as large as the domain of f_5. For this function the size
of the domain has much more influence on N than do the bounds M_x
and M_y: increasing the size of the domain by a factor of four increased

N  by  a  factor  of  about  $0,  but  doubling  M  and  M  only  increased  N 

x  y 

by about 30 percent. With a different function, though, we could easily
reach the opposite conclusion. (f_2 is one example.)

To summarize: if it is possible to give upper bounds M_x and M_y
on the partial second derivatives f_xx and f_yy, then procedure
glomin2d will find a guaranteed good approximation to the global minimum,
but the number of function evaluations required may be considerable,
especially if the domain of f is large or if the bounds M_x and M_y
are weak. As for one-dimensional minimization, the size of the tolerance
t has a fairly small influence on the total number of function evaluations
required.

Finally, we should note that we have restricted ourselves to
rectangular domains merely for the sake of simplicity: there is no
real difficulty in dealing with nonrectangular domains.


9.  Summary  and  conclusions 

In Section 1 we saw that the problem of finding the global minimum
φ_f = f(μ_f) of a function f defined on a compact set is well-posed,
whereas the problem of finding μ_f is not well-posed. To be sure to
find the global minimum, some a priori conditions on f are necessary,
and several possible conditions were discussed in Section 1. We
concentrated our attention mainly on one such condition, a given upper
bound on f'', and small variations of this condition.

An efficient algorithm for one-dimensional global minimization,
based on theorems in Sections 2 and 7, is described in Section 3. The
effect of rounding errors, and the number of function evaluations
required, are discussed in Sections 3 to 5, and numerical results are
given in Section 6. Finally, in Section 8 the results for functions of
one variable are used to give an algorithm for finding the global
minimum of a function of several variables (practically useful for two
or three variables), and ALGOL procedures are given in Section 10. The
ALGOL procedures are guaranteed to give correct results, provided the
basic arithmetic operations are performed with a small relative error
(see the remark following equation (3.50)).

For practical problems, the main difficulty in using the results of
this chapter lies in finding the necessary bounds on second derivatives.
One intriguing idea is that, if f(x) were expressed in terms of
elementary functions, then the second derivatives could be computed
symbolically, and upper bounds could then be obtained from the symbolic
second derivatives by using simple inequalities. Thus, the entire
process of finding the global minimum could be automated. In some cases
functions defined on infinite domains could also be dealt with
automatically by using suitable elementary transformations.


10.  ALGOL  60  procedures 

The ALGOL procedures glomin (for global minimization of a function
of one variable) and glomin2d (for global minimization of a function of
two variables) are given below. The algorithms and some numerical results
are described in Sections 3 to 6 and 8.
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Procedure  glomin 

real procedure glomin (a, b, c, m, macheps, e, t, f, x);
value a, b, c, m, macheps, e, t;
real a, b, c, m, macheps, e, t, x; real procedure f;
begin comment:
    Glomin returns the global minimum y at x of the function
    f(x) defined on [a,b]. The procedure assumes that f ∈ C²[a,b]
    and f''(x) ≤ m for all x ∈ [a,b] (weaker conditions are sufficient:
    see the text). e and t are positive tolerances: we assume that
    f(x) is computed with an absolute error bounded by e, i.e., that
    |fl(f(x(1 + macheps))) − f(x)| ≤ e, where macheps is the relative
    machine precision. Then x and y = glomin are returned so that
        min(f) ≤ f(x) ≤ min(f) + t + 2e and
        min(f) − e ≤ y = fl(f(x)) ≤ min(f) + t + e.
    c is an initial guess at x (a or b will do). The number of
    function evaluations required is usually close to the least possible,
    and considerably less than (b−a)(m/8t)^(1/2), provided t is not
    unreasonably small (see Sections 3 to 6);
  integer k; real a0, a2, a3, d0, d1, d2, h, m2, p, q, qs, r, s, y,
    y0, y1, y2, y3, yb, z0, z1, z2;
  comment: Initialization;
  x := a0 := b; a2 := a;
  yb := y0 := f(b); y := y2 := f(a);
  if y0 < y then y := y0 else x := a;
  if m > 0 ∧ a < b then
  begin comment: Nontrivial case (m > 0, a < b);
    m2 := 0.5 × (1 + 16 × macheps) × m;
    if c ≤ a ∨ c ≥ b then c := 0.5 × (a + b);
    y1 := f(c); k := 3; d0 := a2 − c; h := 9/11;
    if y1 < y then
      begin x := c; y := y1 end;
    comment: Main loop;
next: d1 := a2 − a0; d2 := c − a0;
    z2 := b − a2; z0 := y2 − y1; z1 := y2 − y0;
    p := r := d1 × d1 × z0 − d0 × d0 × z1;
    q := qs := 2 × (d0 × z1 − d1 × z0);
    comment: Try to find a lower value of f using quadratic interpolation;
    if k > 100000 ∧ y < y2 then go to skip;
retry: if q × (r × (yb − y2) + z2 × q × ((y2 − y) + t))
        < z2 × m2 × r × (z2 × q − r) then
    begin a3 := a2 + r/q; y3 := f(a3);
      if y3 < y then
        begin x := a3; y := y3
        end
    end;
    comment: With probability about 0.1 do a random search for a lower
      value of f. Any reasonable random number generator can be used in
      place of the one here (it need not be very good);
skip: k := 1611 × k; k := k − 1048576 × (k ÷ 1048576);
    q := 1; r := (b − a) × (k/100000);
    if r < z2 then go to retry;
    comment: Prepare to step as far as possible;
    r := m2 × d0 × d1 × d2; s := sqrt(((y2 − y) + t)/m2);
    h := 0.5 × (1 + h);
    p := h × (p + 2 × r × s); q := r + 0.5 × qs;
    r := −0.5 × (d0 + (z0 + 2.01 × e)/(d0 × m2));
    r := a2 + (if r < s ∨ d0 < 0 then s else r);
    comment: It is safe to step to r, but we may try to step further;
    a3 := if p × q > 0 then a2 + p/q else r;
inner: if a3 < r then a3 := r;
    if a3 ≥ b then
      begin a3 := b; y3 := yb end
    else y3 := f(a3);
    if y3 < y then
      begin x := a3; y := y3 end;
    d0 := a3 − a2;
    if a3 > r then
    begin comment: Inspect the parabolic lower bound on f in (a2, a3);
      p := 2 × (y2 − y3)/(m × d0);
      if abs(p) < (1 + 9 × macheps) × d0
        ∧ 0.5 × m2 × (d0 × d0 + p × p) > (y2 − y) + (y3 − y) + 2 × t then
      begin comment: Halve the step and try again;
        a3 := 0.5 × (a2 + a3); h := 0.9 × h; go to inner
      end
    end;
    if a3 < b then
    begin comment: Prepare for the next step;
      a0 := c; c := a2; a2 := a3;
      y0 := y1; y1 := y2; y2 := y3;
      go to next
    end
  end;
  glomin := y
end glomin;


Procedure glomin2d

real procedure glomin2d (ax, ay, bx, by, mx, my, macheps, e, t, f, x, y);
value ax, ay, bx, by, mx, my, macheps, e, t;
real ax, ay, bx, by, mx, my, macheps, e, t, x, y;
real procedure f;
begin comment:
    Glomin2d returns the global minimum z = f(x,y) of the function
    f(x,y) defined on the rectangle [ax,bx] × [ay,by]. mx and my
    are upper bounds on the second partial derivatives of f: we
    assume that f_xx(x,y) ≤ mx and f_yy(x,y) ≤ my in the rectangle.
    e and t are positive tolerances: f must be evaluated to an
    accuracy of ±e, and on return
        min(f) ≤ f(x,y) ≤ min(f) + t + 3e and
        min(f) − e ≤ z = fl(f(x,y)) ≤ min(f) + t + 2e.
    macheps is the relative machine precision, and procedure glomin (for
    one-dimensional minimization) is assumed to be global;
  real procedure phi (y); value y; real y;
  begin comment: Returns min f(x,y) over x (y fixed), and may
      alter the global variables first, xs and zm;
    real procedure fx (x); value x; real x;
      begin fx := f(x,y) end fx;
    real ym;
    ym := glomin (ax, bx, xs, mx, macheps, e, t1, fx, xs);
    if first ∨ ym < zm then
      begin first := false; zm := ym; x := xs end;
    phi := ym
  end phi;
  real t1, xs, zm; Boolean first;
  first := true; zm := 0;
  t1 := 0.5 × t; xs := ax;
  glomin2d := glomin (ay, by, ay, my, macheps, t1 + e, t1, phi, y)
end glomin2d;
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Chapter  7* 


A  New  Algorithm  for  Minimizing  a  Function  of  Several  Variables 
Without  Calculating  Derivatives 




1.  Introduction  and  survey  of  the  literature 

In this chapter we consider the general unconstrained minimization
problem: given a function f: R^n → R, find an approximate local minimum
of f. There is no need to emphasize the practical importance of this
problem, and the recent literature on the subject is quite extensive.

Here  we  give  only  a  brief  introduction,  and  no  attempt  is  made  to  duplicate 
the  survey  articles  by  Box  (1966),  Fletcher  (1965,  1959°)  >  and  Powell 
(1970a,  e),  or  the  books  by  Beale  (1968),  Box,  Davies  and  Swann  (1969), 
Jacoby, Kowalik and Pizzo (1971), Kowalik and Osborne (1968), Wilde (1964),
and  Wilde  and  Beightler  (1967)  • 

In  practical  problems  the  global  minimum,  not  a  mere  local  minimum, 
is  usually  of  interest.  Methods  for  finding  global  minima  are  discussed 
in Chapter 6, but for functions of a moderate or large number of variables
the  methods  of  Chapter  6  are  impractical.  Usually  the  best  that  we  can 
do,  in  the  absence  of  any  special  knowledge  about  f  ,  is  to  use  a  good 
local  minimizer  and  try  several  different  combinations  of  starting 
positions,  steplengths  etc.,  in  the  hope  that  the  best  local  minimum 
found  is  the  global  minimum. 

Constrained  problems 

It often happens that we want to minimize f(x) subject to the
constraint that x is in some subset D of R^n. (Sometimes f is
only defined on D.) Simple upper and/or lower bounds, of the form

a_i ≤ x_i ≤ b_i   (1.1)

on the components of x, are particularly common, and problems
with such constraints can be reduced to unconstrained problems by a
transformation of variables (see Box (1966)).
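
As an illustration of such a transformation of variables, the sketch below maps an unconstrained vector u to a point x satisfying the bounds through x_i = a_i + (b_i − a_i) sin²(u_i). This is one commonly used choice and is assumed here for the example; the helper names are hypothetical and the code is not taken from any of the cited works.

```python
import math

def to_bounded(u, lo, hi):
    """Map unconstrained u to x with lo[i] <= x[i] <= hi[i] via
    x = lo + (hi - lo) * sin(u)**2 (one common choice, assumed here)."""
    return [l + (h - l) * math.sin(ui) ** 2 for ui, l, h in zip(u, lo, hi)]

def unconstrained(f, lo, hi):
    """Return F(u) = f(x(u)); minimizing F over all u respects the bounds."""
    return lambda u: f(to_bounded(u, lo, hi))

if __name__ == "__main__":
    F = unconstrained(lambda x: (x[0] - 0.25) ** 2, [0.0], [1.0])
    print(F([0.0]), F([math.asin(0.5)]))
```

Any unconstrained minimizer can then be applied to F, at the price of extra stationary points wherever sin(u_i)cos(u_i) vanishes.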

More general constraints may be of the form

g_i(x) = 0   (an equality constraint)
or                                            (1.2)
g_i(x) ≥ 0   (an inequality constraint),

where g_i: D_i ⊂ R^n → R is some given function, for i = 1, ..., m.
g_i(x) may be linear, say

g_i(x) = a_i^T x + c_i   (1.3)

for some a_i ∈ R^n and c_i ∈ R, or g_i(x) may be nonlinear, and perhaps
quite difficult to compute. From the point of view of efficiency, it is
probably best to deal with linear constraints directly, but this is
difficult for nonlinear constraints. Direct methods for linear constraints
are given in Fletcher (1968b), Goldfarb (1969a), and Rosen (1960). (See
also  Bartels  (1968),  Bartels  and  Golub  (1969),  Bartels,  Golub  and 
Saunders  (1970),  Gill  and  Murray  (1970),  Goldfarb  and  Lapidus  (1968), 

Hanson  (1970),  and  Shanno  (1965,  1970b).) 

Problems  with  nonlinear  constraints  can  be  reduced  to  a  sequence  of 
unconstrained  problems  by  the  use  of  penalty  or  barrier  functions.  (See 
Carroll  (1961),  Fiacco  (1961,  1967,  1969),  Fiacco  and  Jones  (1969), 

Fiacco  and  McCormick  (1968),  Fletcher  (1969b),  Fletcher  and  McCann  (1969), 
Jones  and  McCormick  (1969),  Kowalik,  Osborne  and  Ryan  (1969),  Lootsma 
(1968,  1970),  Murray  (1969a,  b),  Osborne  and  Ryan  (1970,  1971), 
Pietrzykowski (1969), and Zangwill (1967b).) Attempts have also been made
to deal with nonlinear constraints directly. (See Allran and Johnsen
(1970), Box (1965), Haarhoff and Buys (1970), Kalfon, Ribiere and
Sogno (1968), Luenberger (1970), Mitchell and Kaplan (1968), Murtagh
and Sargent (1969), Powell (1969d), Rosen (1961), and Zoutendijk (1960,
1970).)

Methods  using  derivatives 

Many methods for the constrained or unconstrained minimization of
f: D → R explicitly use the partial derivatives ∂f/∂x_i, for
i = 1, ..., n, and some methods also use the second partial derivatives
of f. (Methods for constrained minimization may also use the partial
derivatives of the constraint functions g_i.) For example, the
classical  method  of  steepest  descent  (Akaike  (1959),  Cauchy  (1847), 
Curry  (1944),  Forsythe  (1968),  Goldstein  (1962,  1965),  and  Ostrowski 
(1966,  1967a))  repeatedly  minimizes  f  in  the  direction  -g  ,  where 


is  the  gradient  of  f  .  Perhaps  the  most  successful  methods  using 
derivatives  are  the  Davidon-Fletcher-Powell  "variable  metric"  method 
(Davidon  (1959) ,  Fletcher  and  Powell  (1963),  Huang  (1970),  and 
McCormick (1969)), and the conjugate gradient method of Fletcher and
Reeves  (1964),  which  is  slower  but  requires  less  storage  than  the 
variable  metric  method.  (For  other  methods  using  derivatives,  and  related 
topics,  see  Bard  (1968,  1970),  Broyden  (1970a,  b),  Cantrell  (1969),  Cragg 
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and  Levy  (1969),  Davidon  (1968,  1969),  Davies  (1968),  Fletcher  (1966, 

1970),  Goldfarb  (1966,  1969b,  1970),  Goldfeld,  Quandt  and  Trotter  (1968) , 
Greenstadt  (1967,  1970),  Hestenes  (1969),  Kelley  and  Myers  (1967) , 

Luenberger  (1969b),  McCormick  and  Pearson  (1969),  Miele  and  Cantrell 
(1969,  1970),  Myers  (1968),  Pearson  (1969),  Powell  (1969b,  c,  I97O0,  c,  d), 
Ramsay  (1970),  Shanno  (1969a,  b),  Shanno  and  Kettler  (1969),  Sorensen 
(1969),  Takahashi  (1965),  Tokumaru,  Adachi  and  Goto  (1970),  Vercoustre 
(1970),  Goldstein  and  Price  (1967),  and  Wells  (1965)-) 

In many practical problems, it is difficult or impossible to find
the partial derivatives of f(x) directly. One possibility is to
compute derivatives numerically, e.g., by finite differences, and then
use one of the methods requiring derivatives. Stewart (1967) has
successfully modified the variable metric method so that difference
approximations to derivatives can be used. The difficulty is in
balancing the influence of rounding errors and truncation errors when
using finite differences to estimate derivatives. For a computer program,
see Lill (1970).
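
The balance between rounding and truncation errors can be illustrated with a minimal sketch. It is not Stewart's method or Lill's program: it is a plain forward-difference gradient in which the relative step sqrt(eps) is a common heuristic, assumed here, and the names fd_gradient and rosenbrock are hypothetical, chosen only for the illustration.

```python
import math
import sys


def fd_gradient(f, x, rel_step=None):
    """Forward-difference estimate of the gradient of f at x (a list of floats)."""
    eps = sys.float_info.epsilon
    if rel_step is None:
        # Heuristic (an assumption): truncation error grows like h, rounding
        # error like eps/h, so a step near sqrt(eps) roughly balances the two.
        rel_step = math.sqrt(eps)
    fx = f(x)
    grad = []
    for i in range(len(x)):
        h = rel_step * max(abs(x[i]), 1.0)
        xh = list(x)
        xh[i] = x[i] + h
        # Divide by the step actually taken, which may differ slightly from h.
        grad.append((f(xh) - fx) / (xh[i] - x[i]))
    return grad


def rosenbrock(x):
    """A standard two-variable test function, used only as an example."""
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2


g = fd_gradient(rosenbrock, [-1.2, 1.0])
print(g)  # the analytic gradient at this point is (-215.6, -88.0)
```

The estimate costs n+1 function evaluations, and its accuracy is limited to roughly half the available digits, which is one reason why methods that avoid derivatives altogether are of interest.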

Methods  not  using  derivatives 

Although Stewart's modification of the variable metric method
appears to work well in most practical cases (see Stewart (1967),
Powell (1970a), and Section 7), it is more natural to use a method which
does not need derivatives, if derivatives can only be found numerically.
Possibly such methods could be more efficient than methods which approximate
derivatives numerically, although this is less clear in n dimensions than
in one dimension (for which see Chapter 5).

Several methods which do not use derivatives have been compared in
the survey papers of Box (1966), Fletcher (1965, 1969c), Powell (1970a, e),
and Spang (1962). (See also Bell and Pike (1966), Berman (1969), Box
(1957), Chazan and Miranker (1970), Hooke and Jeeves (1961), Kowalik
and Osborne (1968), Nelder and Mead (1965), Smith (1962), Spendley (1969),
Spendley, Hext and Himsworth (1962), Swann (1964), and Winfield (1967).)
Excluding Stewart's method, the most successful method, especially for
functions of more than three or four variables, appears to be that of
Powell (1964) (see Section 3). The main object of this chapter is to
present some modifications which improve the speed and reliability of
Powell's method. The modifications are discussed in Sections 4 to 6,
and some numerical results are given in Section 7.


for  i,  j  =  1,  ...,n  ,  in  a  neighbourhood  N  of  a  local  minimum  p  . 
Since  p  is  a  minimum,  the  gradient  of  f  vanishes  at  p  ,  and  the 
Hessian  matrix 

A = (f_ij(p))  (1.7)

is positive definite or semi-definite. Near p , the quadratic form

Q(x) = f(p) + (1/2) (x - p)^T A (x - p)  (1.8)

is a good approximation to f(x) . Thus, any minimization method, having
ultimate  fast  convergence  for  a  general  function  f(x)  with  continuous 
second  derivatives,  must  have  fast  convergence  for  a  positive  definite 
quadratic  form,  and  we  might  expect  the  converse  to  hold  too.  This 
observation  has  led  to  the  investigation  of  methods  which  have  quadratic 
convergence, i.e., which find the minimum of a positive definite quadratic
form in a finite number of function and/or derivative evaluations, apart
from the effect of rounding errors. Examples of methods with quadratic
convergence are those of Davidon-Fletcher-Powell, Fletcher and Reeves,
and Powell (1964) (this is not quite true: see Section 3). The method
of steepest descent exhibits only linear convergence on a quadratic form,
so it is not quadratically convergent.

A few methods are not quadratically convergent, for exact convergence
requires an infinite number of steps, but they do exhibit superlinear
convergence on quadratic forms. Examples are the methods of Rosenbrock,
as modified by Davies, Swann and Campey (see Swann (1964)), of Goldstein
and Price (1967), and of Greenstadt (1970). There is no apparent reason
why such methods should fail to perform as well as quadratically convergent
methods  on  general  (nonquadratic)  functions.  Thus,  quadratic  convergence 
is  a  desirable  property,  but  it  is  neither  necessary  nor  sufficient  for 
a  good  minimization  method. 

Stability:  the  descent  property 

In many methods for unconstrained minimization, f(x) has been
evaluated at x_0 , the current best estimate of the position of the
minimum of f(x) . A new estimate, x_0^* , is made on the basis of the
values of f at x_0 and a small number of other points (previous best
estimates, or points close to x_0 ). Additional information built up
from previous iterations, e.g., an approximation to the Hessian matrix
of f at x_0 , may also be used. The prediction x_0^* may be unreliable,
and it may happen that

f(x_0^*) > f(x_0) .  (1.9)

For example, this often occurs if x_0 is not close to a local minimum,
and an inadequate quadratic approximation to f(x) is used.

To avoid the possibility of instability, most procedures do not
accept x_0^* as the next approximation to the minimum. Instead, they
perform a "linear search" in the direction x_0^* - x_0 , i.e., they take
the point

x_1 = x_0 + λ_0 (x_0^* - x_0)  (1.10)

as the next approximation, where λ_0 is chosen to minimize the function

φ(λ) = f(x_0 + λ(x_0^* - x_0))  (1.11)

of one variable. This ensures that

f(x_1) ≤ f(x_0) ,  (1.12)

so the successive points generated must lie in the "level set"

S = {x ∈ R^n | f(x) ≤ f(x_0)} .  (1.13)

In practice, it is not worthwhile to try to minimize the function
φ(λ) very accurately. In fact, the minimum may not even exist: φ(λ) may
be monotonic increasing or decreasing, or have a maximum but no minimum.
Box (1966) gives examples where an attempt to minimize φ(λ) too accurately
prevents a minimization procedure from finding the desired minimum. It
is sometimes stated that the quadratic convergence property of certain
methods depends on φ(λ) being minimized exactly, but all that is really
required for these methods is that the one-dimensional minimization
procedure minimizes a quadratic function of λ exactly. Thus, for
quadratic convergence, it is sufficient to fit a parabola P(λ) to φ(λ) ,
and take λ_0 = λ_0^* , where λ_0^* minimizes P(λ) . Because of the danger
of instability, this simple procedure is not acceptable, but it is reasonable
to take λ_0 = λ_0^* provided that

φ(λ_0) < φ(0) ,  (1.14)

which ensures that (1.12) holds. (Powell (1970e) gives some reasons
for requiring rather more than (1.14).) See also Sections 6 and 7.
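
A minimal sketch of the safeguarded parabolic step is given here for illustration only. The three sample points 0, t and 2t, the trial step t, and the decision to return 0 when the step is rejected are all assumptions made for the example; a practical procedure would try other steps instead of giving up. The names parabola_minimizer and safeguarded_step are hypothetical.

```python
def parabola_minimizer(l0, p0, l1, p1, l2, p2):
    """Abscissa of the minimum of the parabola through three points, or None."""
    d01 = (p1 - p0) / (l1 - l0)
    d12 = (p2 - p1) / (l2 - l1)
    second = (d12 - d01) / (l2 - l0)  # second divided difference
    if second <= 0.0:
        return None  # the parabola has a maximum or is a straight line
    # Newton form: P(l) = p0 + d01*(l - l0) + second*(l - l0)*(l - l1).
    # Setting P'(l) = 0 gives the stationary point below.
    return 0.5 * (l0 + l1 - d01 / second)


def safeguarded_step(phi, trial=1.0):
    """Return the parabolic estimate only if it satisfies the test (1.14)."""
    lams = (0.0, trial, 2.0 * trial)
    vals = [phi(lam) for lam in lams]
    lam_star = parabola_minimizer(lams[0], vals[0], lams[1], vals[1],
                                  lams[2], vals[2])
    if lam_star is not None and phi(lam_star) < vals[0]:
        return lam_star  # phi(lam_star) < phi(0), so the descent property holds
    return 0.0  # reject the step (a simplification for this sketch)


print(safeguarded_step(lambda lam: (lam - 0.3) ** 2 + 1.0))
print(safeguarded_step(lambda lam: -lam * lam))
```

For an exactly quadratic φ the accepted step is the true minimizer, which is all that quadratic convergence requires; for a φ with no minimum the test simply refuses the step.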

Sums  of  squares 

A very common unconstrained minimization problem is to minimize a
function f(x) of the form

f(x) = Σ_{i=1}^{m} [f_i(x)]^2 ,  (1.15)

for some (generally nonlinear) functions f_i(x) . For example, this
problem arises when parameters x_1, ..., x_n are fitted, by the method of
least squares, using m observations. An important special case arises
when the minimum value of f(x) is zero: then we have a solution of the
system of equations

f_i(x) = 0 ,  (1.16)

for i = 1, ..., m .

Applying a general function minimizer to f(x) may not be the most
efficient way to minimize (1.15). Methods which make use of the individual
residuals f_i(x) are likely to be considerably more efficient than
methods which merely try to minimize f(x) without considering the
individual residuals, at least if the minimum value of f(x) is close to
zero. Methods which make use of the residuals are described in Barnes
(1965), Box (1966), Brown and Dennis (1968, 1970, 1971a, b), Broyden (1967,
1969), Dennis (1968, 1969a, b, c), Fletcher (1968a), Gauss (1809),
Hartley (1961), Jones (1970), Levenberg (1944), Marquardt (1963),
Matthews and Davies (1969), Morrison (1968), Ortega (1970), Ortega and
Rheinboldt (1970), Peckham (1970), Powell (1965, 1968b, 1969a),
Rabinowitz (1969), Rall (1966, 1969), Schubert (1970), Shanno (1970a),
Spath (1967), Voigt (1969), Wolfe (1959a), and Zeleznik (1968). Good
numerical methods for solving linear least squares problems are also
relevant: see Bjorck (1967a, b, 1968), Businger and Golub (1965),
Golub (1965, 1968), Golub and Reinsch (1970), Golub and Saunders (1969),
Golub and Wilkinson (1966), Jordan (1968), Khabaza (1963), Maddison (1966),
and Powell and Reid (1968).

Let us see why it may be worthwhile to use the residuals. Suppose
that we have a good initial approximation to the minimum of f(x) , so the
functions f_i(x) can be closely represented by linear approximations in
the region of interest. To find a linear approximation to f_i(x) , we
need to evaluate f_i(x) at n+1 points, or evaluate f_i(x) and the
n components of its gradient at one point. Thus, after the same amount
of work as is required for n+1 evaluations of f(x) , or one evaluation
of f(x) and its gradient, the solution of a linear least squares problem
gives an approximation to the minimum. This approximation is usually good
if the minimum value of f(x) is small (see Powell (1965)), unless the
linear problem is very ill-conditioned. On the other hand, if the special
form (1.15) of f(x) is disregarded, then it is necessary to evaluate
f(x) at (1/2)(n+1)(n+2) points to find an approximating quadratic form.
(Alternatively, f and its gradient must be evaluated at (1/2)(n+2)
or more points.) This suggests that methods which disregard the special
form of f(x) are likely to be much slower than methods which use the
individual residuals, at least if n is large. Empirical evidence
supports this conclusion (see particularly Table 3 of Box (1966) for
n = 20 ), although some of the present methods which make use of the
residuals appear to be rather unreliable.
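
As a small worked illustration of the counting argument, the following sketch tabulates the number of evaluations needed to determine a linear model of the residuals (n+1) against the number needed to determine a general quadratic model of f, namely (n+1)(n+2)/2. The function names are hypothetical and the counts ignore any reuse of points between iterations.

```python
def evals_for_linear_model(n):
    """Evaluations of the residual vector needed to fit linear models of the f_i."""
    return n + 1


def evals_for_quadratic_model(n):
    """Evaluations of f needed to fit a general quadratic form in n variables."""
    return (n + 1) * (n + 2) // 2


for n in (2, 5, 20):
    print(n, evals_for_linear_model(n), evals_for_quadratic_model(n))
```

For n = 20 the counts are 21 and 231, so the approach that ignores the residuals needs about eleven times as many evaluations before it has a usable model.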

Despite our conclusion, most of the numerical examples given in
Section 7 are of the form (1.15). This is because a particularly simple
way to construct test functions with bounded level sets is to use functions
of the form (1.15), and most of the test functions given in the literature
have this form.

Some  additional  references 

The following general references on function minimization and related
topics have not been mentioned above: Abadie (1970), Balakrishnan (1970),
Bennett (1965), Bennett and Green (1966), Colville (1968), Davies (1969),
Davies and Swann (1969), Dold and Eckmann (1970a, b), Evans and Gould
(1970), Fletcher (1969a), Hadley (1964), King (1966), Kunzi, Tzschach
and Zehnder (1968), Lavi and Vogl (1966), Leon (1966), Luenberger (1969a),
Mangasarian (1969), Murtagh (1969), Murtagh and Sargent (1970), Powell
(1966, 1969e), Ralston and Wilf (1960), Rice (1970), Rosen and Suzuki
(1965), Shah, Buehler and Kempthorne (1964), Wolfe (1963, 1969), Zadeh
(1969), Zangwill (1969a, b), and Zoutendijk (1966).


2.  The  effect  of  rounding  errors 

Rounding  errors  in  the  computation  of  f(x)  limit  the  accuracy 
attainable  with  any  minimization  method  using  only  the  computed  values 
of f(x) . In this section, we generalize the results of Section 5.2,
where  the  same  problem  is  considered  for  functions  of  one  variable.  As 
in  Section  5.2,  the  results  of  this  section  do  not  necessarily  apply  to 
methods  which  use  the  gradient  of  f  ,  computed  analytically.  (They  do 
apply  if  the  gradient  is  computed  by  finite  differences.) 

Suppose that, in a neighbourhood N of a local minimum p , the
partial derivatives f_ij(x) are Lipschitz continuous, i.e., for all
x, y ∈ N ,

|f_ij(x) - f_ij(y)| ≤ M_ij ‖x - y‖ ,  (2.1)

where M_ij is a Lipschitz constant (i, j = 1, ..., n) , and any of the
usual vector norms may be used. Since the gradient of f(x) vanishes
at p , a simple extension of Lemma 2.3.1 shows that, for x ∈ N ,

f(x) = f(p) + (1/2) (x - p)^T A (x - p) + R(x) ,  (2.2)

where

A = (f_ij(p))  (2.3)

is the Hessian matrix of f(x) at p , and

|R(x)| ≤ M ‖x - p‖^3 ,  (2.4)

for some constant M depending on n , the norm used, and the
Lipschitz constants M_ij .

As in Section 5.2, the best that can be expected is that the computed
value fl(f(x)) of f(x) satisfies the nearly attainable bound

fl(f(x)) = f(x)(1 + ε_x) ,  (2.5)

where

|ε_x| ≤ ε ,  (2.6)

and ε is the relative machine precision (see Section 4.2). If f is
computed using single-precision arithmetic, the error bound will probably
be considerably worse than this.

Let δ be the largest number such that, according to equations
(2.2) to (2.6), it is possible that

fl(f(p + δu)) ≤ f(p) ,  (2.7)

for some unit vector u . Then it is unreasonable to expect any
minimization procedure, based on single-precision evaluations of f , to
return an approximation p̂ to p with a guaranteed upper bound for
‖p̂ - p‖ less than δ .

Let the eigenvalues of A be λ_1 ≥ λ_2 ≥ ... ≥ λ_n , with a set of
corresponding normalized eigenvectors u_1, u_2, ..., u_n . Since p is a
local minimum of f(x) , certainly

λ_n ≥ 0 ,  (2.8)

and we suppose that λ_n > 0 . (The position of the minimum is worse
determined if λ_n = 0 .) If Mδ/λ_n is small compared to unity, and
we take u = u_n , then (2.7) is possible for

δ ≅ (2ε|f(p)|/λ_n)^{1/2} .  (2.9)

Thus, an upper bound on ‖p̂ - p‖ can hardly be less than the right side
of (2.9).
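
To give a feeling for the size of this limit, the sketch below evaluates it numerically. It assumes that (2.9) has the form δ ≅ (2ε|f(p)|/λ_n)^{1/2}, which is what the expansion (2.2) and the error model (2.5) to (2.6) suggest when the remainder R(x) is neglected; the function name and the sample values are hypothetical.

```python
import math
import sys


def attainable_accuracy(f_min, lambda_n, eps=sys.float_info.epsilon):
    """Assumed form of (2.9): sqrt(2 * eps * |f(p)| / lambda_n)."""
    return math.sqrt(2.0 * eps * abs(f_min) / lambda_n)


# With |f(p)| = 1 and smallest Hessian eigenvalue 2, only about half of the
# working precision can be expected in the position of the minimum.
print(attainable_accuracy(1.0, 2.0))
# A flatter minimum (smaller lambda_n) is located less accurately.
print(attainable_accuracy(1.0, 2.0e-6))
```

The square root is the important feature: errors of relative size ε in f translate into errors of relative size about ε^{1/2} in the computed minimizer.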


The  condition  number 

With the assumptions above, and δ given by (2.9),

f(p + δu_1) ≅ f(p) + κ ε |f(p)| ,  (2.10)

where

κ = λ_1/λ_n  (2.11)

is the usual condition number of A . We shall say that κ is the
condition number of the minimization problem (for the local minimum p ).
The condition number determines the rate of convergence of some minimization
methods (e.g., steepest descent), and it is also important because rounding
errors make it difficult to solve problems with condition numbers of the
order of ε^{-1} or greater (see below).

Scaling

A change of scale along the coordinate axes has the effect of
replacing the Hessian matrix A by SAS , where S is a positive
diagonal matrix. The problem of choosing S to minimize the condition
number of SAS is difficult, even if A is known explicitly. (See
Forsythe and Moler (1967) for the problem of minimizing the condition
number of S_1 A S_2 , where A is not necessarily symmetric.) A good
general rule is that SAS should be roughly row (and hence column)
equilibrated (see Wilkinson (1963, 1965a)). In practical minimization
problems, one difficulty is that little is known about the Hessian
matrix A until a reasonable approximation to the minimum
has been found. This suggests that a general function minimizer which
is  scale-dependent  could  incorporate  an  automatic  scaling  procedure, 
using  current  information  about  A  to  determine  the  scaling.  One  way 
of  doing  this  is  described  in  Section  4. 


3. Powell's algorithm

In this section we briefly describe Powell's algorithm for minimization
without calculating derivatives. The algorithm is described more fully
in Powell (1964), and a small error in this paper is pointed out by
Zangwill (1967a). Numerical results are given in Fletcher (1965),
Box (1966), and Kowalik and Osborne (1968). A modified algorithm, which
is suitable for use on a parallel computer, and which converges for
strictly convex C^2 functions with bounded level sets, is described by
Chazan and Miranker (1970).

Powell's method is a modification of a quadratically convergent
method proposed by Smith (1962). Both methods ensure convergence in a
finite number of steps, for a positive definite quadratic form, by
making use of some properties of conjugate directions.

Conjugate  directions 

If A is positive definite and symmetric, then minimizing the
quadratic function

x^T A x - 2 b^T x = (x - A^{-1}b)^T A (x - A^{-1}b) - b^T A^{-1} b  (3.1)

is equivalent to solving the system of linear equations

Ax = b .  (3.2)

If the matrix A is known explicitly, then, instead of minimizing
(3.1), we can solve (3.2) by any suitable method: for example, by forming
the Cholesky decomposition of A . In the applications of interest here,
A is the Hessian matrix of a certain function, and is not known explicitly,
but the equivalence of the problems (3.1) and (3.2) is still useful.

Definition  3.1 

Two  vectors  u  and  v  are  said  to  be  conjugate  with  respect  to 
the  positive  definite  symmetric  matrix  A  if 

u^T A v = 0 .  (3.3)

When there is no risk of confusion, we shall simply say that u
and v are conjugate. By a set of conjugate directions, we mean a set
of vectors which are pairwise conjugate.

Remark

If {u_1, ..., u_m} is any set of nonzero conjugate directions in R^n ,
then u_1, ..., u_m are linearly independent. Thus m ≤ n , and m = n iff
u_1, ..., u_m span R^n .

Theorem 3.1

If A is positive definite symmetric, Ax = b , and {u_1, ..., u_m}
is a set of nonzero conjugate directions, then

x' = x - Σ_{i=1}^{m} (u_i^T b / u_i^T A u_i) u_i  (3.4)

is conjugate to each of u_1, ..., u_m .

Proof

If 1 ≤ j ≤ m , then, from (3.4) and the conjugacy of the u_i ,

u_j^T A x' = u_j^T (Ax - b) = 0 .  (3.5)

If m = n in Theorem 3.1, then x' = 0 , so

x = Σ_{i=1}^{n} (u_i^T b / u_i^T A u_i) u_i .  (3.6)
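
The expansion of the solution of Ax = b in terms of n nonzero conjugate directions, x = Σ (u_i^T b / u_i^T A u_i) u_i , can be checked numerically. In the sketch below the conjugate directions are produced by a Gram-Schmidt process applied to the coordinate axes; that construction, the helper names and the 2 by 2 test matrix are assumptions made for the illustration and are not taken from the text.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_directions(A):
    """A-conjugate directions built from the coordinate axes (A symmetric p.d.)."""
    n = len(A)
    dirs = []
    for k in range(n):
        u = [1.0 if i == k else 0.0 for i in range(n)]
        for v in dirs:
            Av = matvec(A, v)
            c = dot(u, Av) / dot(v, Av)  # remove the component conjugate to v
            u = [ui - c * vi for ui, vi in zip(u, v)]
        dirs.append(u)
    return dirs


def solve_by_conjugate_directions(A, b):
    """Sum of (u_i^T b / u_i^T A u_i) u_i over a full set of conjugate directions."""
    x = [0.0] * len(b)
    for u in conjugate_directions(A):
        coeff = dot(u, b) / dot(u, matvec(A, u))
        x = [xi + coeff * ui for xi, ui in zip(x, u)]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = solve_by_conjugate_directions(A, b)
print(x)  # the exact solution is (1/11, 7/11)
```

Each coefficient depends on one direction only, which is why the n one-dimensional problems can be treated independently and in any order.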


Returning to the minimization problem, Theorem 3.1 and the equivalence
of problems (3.1) and (3.2) give the following result.

Theorem 3.2

If A is positive definite symmetric,

f(x) = x^T A x - 2 b^T x + c  (3.7)

for some b ∈ R^n and c ∈ R , and {u_1, ..., u_m} is a set of nonzero conjugate
directions, then the minimum of f(x) in the space spanned by u_1, ..., u_m
occurs at the point Σ_{i=1}^{m} β_i u_i , where

β_i = u_i^T b / (u_i^T A u_i) .  (3.8)

Proof

This follows from Theorem 3.1, or, alternatively, from the relation

f(Σ_{i=1}^{m} α_i u_i) = Σ_{i=1}^{m} [α_i^2 u_i^T A u_i - 2 α_i u_i^T b] + c  (3.9)

(cross terms vanish because of the conjugacy of u_1, ..., u_m ).

The usefulness of Theorem 3.2 stems from the following result,
which shows how we can calculate the β_i of (3.8) using function
evaluations, even if A , b and c are not known explicitly.

Theorem 3.3

With the notation of Theorem 3.2, a fixed j satisfying 1 ≤ j ≤ m ,
and fixed α_1, ..., α_{j-1}, α_{j+1}, ..., α_m , the minimum of

f(Σ_{i=1}^{m} α_i u_i) ,  (3.10)

regarded as a function of α_j , occurs at α_j = β_j .

Proof

This follows immediately from equation (3.9).

From Theorems 3.2 and 3.3, we see that the minimum of the quadratic
form f(x) can be found by n one-dimensional minimizations along nonzero
conjugate directions u_1, ..., u_n , and the order of the one-dimensional
minimizations is irrelevant. To use this result, we have to be able to
generate sets of conjugate directions. Both Powell's method and Smith's
method do this by using the following theorem, given in Powell (1964).

Theorem 3.4

If the minimum of f(x) (given by (3.7)) in the direction u from
the point x_i^* is at x_i , for i = 0, 1 , then x_1 - x_0 is conjugate
to u .

Proof

For i = 0 and 1 ,

(∂/∂λ) f(x_i + λu) = 0 at λ = 0 ,  (3.11)

so, from (3.7),

u^T (A x_i - b) = 0 .  (3.12)

Subtracting equations (3.12) for i = 0 and 1 gives

u^T A (x_1 - x_0) = 0 ,  (3.13)

which completes the proof.

Powell's basic procedure

We can now describe the basic idea of Powell's algorithm. Let x_0
be the initial approximation to the minimum, and let u_1, ..., u_n be
the columns of the identity matrix. One iteration of the basic procedure
consists of the following steps:

1. For i = 1, ..., n , compute β_i to minimize f(x_{i-1} + β_i u_i) ,
and define x_i = x_{i-1} + β_i u_i .

2. For i = 1, ..., n-1 , replace u_i by u_{i+1} .

3. Replace u_n by x_n - x_0 .

4. Compute β to minimize f(x_0 + β u_n) , and replace x_0 by x_0 + β u_n .


For a general (non-quadratic) function, we just repeat the iteration
until some stopping criterion is satisfied. Suppose that 1 ≤ k ≤ n ,
and consider the situation after the k-th iteration. If f is quadratic
then we can show, by induction on k , that u_{n-k+1}, ..., u_n are conjugate.
This follows from the choice of u_n at step 3, and Theorem 3.4: see
Powell (1964). After n iterations, we have minimized along n
conjugate directions u_1, ..., u_n , so, by Theorems 3.2 and 3.3, the
minimum will have been reached if the u_i are all nonzero. This is
true if, at each iteration, β_1 ≠ 0 , for then the directions u_1, ..., u_n
can not become linearly dependent.
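
The basic procedure can be sketched in a few lines. This is an illustration only, not the algorithm as implemented by Powell or in this chapter: the line minimization fits a parabola through the fixed points β = -1, 0, 1 (exact for a quadratic, but a crude assumption for general functions), there is no stopping criterion, and the names line_min and powell_iteration are hypothetical.

```python
def line_min(f, x, u):
    """Step beta minimizing the parabola through f at x - u, x and x + u."""
    def phi(beta):
        return f([xj + beta * uj for xj, uj in zip(x, u)])
    pm, p0, pp = phi(-1.0), phi(0.0), phi(1.0)
    curv = pm - 2.0 * p0 + pp  # twice the leading coefficient of the parabola
    if curv <= 0.0:
        return 0.0  # no minimum along u (or u is zero): take no step
    return 0.5 * (pm - pp) / curv


def powell_iteration(f, x0, U):
    """One iteration of the basic procedure; the direction list U is updated."""
    n = len(x0)
    x = list(x0)
    for i in range(n):  # step 1: minimize along each direction in turn
        beta = line_min(f, x, U[i])
        x = [xj + beta * uj for xj, uj in zip(x, U[i])]
    for i in range(n - 1):  # step 2: discard u_1, shift the others down
        U[i] = U[i + 1]
    U[n - 1] = [xj - x0j for xj, x0j in zip(x, x0)]  # step 3: u_n = x_n - x_0
    beta = line_min(f, x0, U[n - 1])  # step 4: minimize from x_0 along u_n
    return [x0j + beta * uj for x0j, uj in zip(x0, U[n - 1])]


def f(x):
    # x^T A x - 2 b^T x with A = [[4, 1], [1, 3]] and b = (1, 2)
    return (4.0 * x[0] ** 2 + 2.0 * x[0] * x[1] + 3.0 * x[1] ** 2
            - 2.0 * x[0] - 4.0 * x[1])


x = [0.0, 0.0]
U = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(2):  # n = 2 iterations suffice for this quadratic
    x = powell_iteration(f, x, U)
print(x)  # the minimum is at (1/11, 7/11)
```

In this example β_1 is nonzero at both iterations, so the two final directions in U are conjugate with respect to A and the minimum is reached after n = 2 iterations, up to rounding errors.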


The problem of linear dependence

Unfortunately, as pointed out by Zangwill (1967a), even for a
quadratic function f one of the iterations may have β_1 = 0 , which
results in the directions u_1, ..., u_n becoming linearly dependent, and
from then on the procedure can only find the minimum of f(x) over a
proper subspace of R^n . The same is, of course, true for non-quadratic
functions, and even though it is unlikely that β_1 will vanish exactly,
Powell discovered that the directions u_1, ..., u_n often become nearly
linearly dependent. Thus, he suggests that the new direction x_n - x_0
should be used, and one of the old u_1, ..., u_n discarded, only if this
does not decrease the value of |det(v_1 ... v_n)| , where

v_i = (u_i^T A u_i)^{-1/2} u_i  (3.14)

for i = 1, ..., n . With this modification the algorithm is quite successful
(see Fletcher (1965) and Box (1966) for a comparison with other methods),
but the desirable property of quadratic convergence is lost, for a complete
set of conjugate directions may never be built up. In the next section,
we describe a different way of avoiding the problem of linear dependence
of the search directions. The numerical results given in Section 7
suggest that our method of ensuring linear independence may be preferable
to Powell's.


4.  The  main  modification 

The simplest way to avoid linear dependence of the search directions
with Powell's basic procedure, and retain quadratic convergence if β_1 ≠ 0 ,
is to reset the search directions u_1, ..., u_n to the columns of the
identity matrix after, say, every n iterations. A similar "restarting"
device is suggested by Fletcher and Reeves (1964) for their conjugate
gradient method. Unfortunately, restarting tends to slow down convergence
for approximately quadratic functions, because any information built up
about the function is periodically thrown away. (Perhaps this is why
the Fletcher-Reeves algorithm is generally slower than the
Davidon-Fletcher-Powell algorithm.)

Instead of resetting U = [u_1, ..., u_n] to the identity matrix, we
could equally well reset U to any orthogonal matrix Q . To avoid
discarding useful information about f , we could choose Q so that,
if f is quadratic, u_1, ..., u_n remain conjugate. This suggests that
principal vectors q_1, ..., q_n should be computed on the assumption that
f is quadratic, and U should be reset to Q = [q_1, ..., q_n] . The
motivation for this procedure may be summarized thus:

1.  If  the  quadratic  approximation  to  f  is  good,  then  the  new  search 
directions  should  be  conjugate  with  respect  to  a  matrix  which  is  close 
to  the  Hessian  matrix  of  f  at  the  minimum,  and  thus  subsequent 
iterations  should  give  fast  convergence. 

2.  Regardless  of  the  validity  of  the  quadratic  approximation,  the  new 
search  directions  are  orthogonal,  so  the  search  for  a  minimum  can  never 
become  restricted  to  a  subspace. 

The  extra  computation  involved 

We show below that finding principal axes does not require any
extra function evaluations, but it does involve finding an orthogonal
set of eigenvectors for a symmetric matrix H of order n . This requires
about 6n^3 multiplications, and a similar number of additions, if done
as suggested below. Since the principal axes are found only once for
every n^2 linear minimizations, and a linear minimization requires about
2.25 function evaluations on the average (see Section 7), the extra
computation is less than 3n multiplications per function evaluation.
We can expect the evaluation of a nontrivial function of n variables to
require considerably more than 3n multiplications, and possibly order n^2 ,
so the overhead caused by our modification is not excessive. Also, it
may be worth paying a little for the principal axis reduction, for the
extra information about f is often of interest. For example, it
shows the sensitivity of f(x) to slight changes in x near the minimum.
The principal axes and eigenvalues may be of interest in statistical
problems when f is minus the log-likelihood, for then the inverse of
the Hessian at the minimum is the sample variance-covariance matrix of
the maximum likelihood estimates: see Nelder and Mead (1965).

Scaling 

Powell's modification of his basic procedure has one feature which
ours lacks: his determinantal criterion is independent of a linear
transformation of the independent variable space (an important special
case is a change of scale for the independent variables). This feature
is certainly desirable, for when a function of, say, temperature and
pressure is to be minimized, there is no natural way to scale the variables.
We should note, though, that Powell's algorithm is not completely
independent of linear transformations of the variable space, or even of
scale changes, for these influence both the initial choice of the
vectors u_1, ..., u_n , and the stopping criterion.

Finding the principal vectors

Suppose that

f(x) = x^T A x - 2 b^T x + c  (4.1)

is a positive definite quadratic form, although A , b and c may not
be known explicitly. If n iterations of Powell's basic procedure are
performed as described above, and at each iteration β_1 ≠ 0 , then we
obtain n nonzero conjugate directions u_1, ..., u_n . Let U = [u_1 ... u_n] .
By the conjugacy of u_1, ..., u_n ,

U^T A U = D ,  (4.2)

where D is a diagonal matrix with positive diagonal elements d_i .

During the last (i.e., n-th) iteration, we have performed
one-dimensional minimizations in the directions u_1, ..., u_n . Consider a
minimization from the point x_{i-1} , in the direction u_i , for
1 ≤ i ≤ n . We minimize the function

φ_i(α) = f(x_{i-1} + α u_i)  (4.3)

       = α^2 u_i^T A u_i + 2α (u_i^T A x_{i-1} - u_i^T b) + (x_{i-1}^T A x_{i-1} - 2 x_{i-1}^T b + c) .  (4.4)

To minimize φ_i(α) we fit a parabola, which necessitates computing the
second difference φ_i[α_0, α_1, α_2] for three distinct points α_0 , α_1
and α_2 . From equation (4.4),

φ_i[α_0, α_1, α_2] = u_i^T A u_i ,  (4.5)


so the diagonal elements of D are known without any extra
computation. (If the quadratic approximation to φ_i(α) is bad we may
have φ_i[α_0, α_1, α_2] ≤ 0 , and then we arbitrarily set d_i to a small
positive number.)
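
The following sketch illustrates how a second divided difference of the line function recovers u^T A u without knowledge of A . The quadratic, the base point, the sample abscissae and the fallback value 1e-12 for a non-positive difference are all assumptions chosen for the example; the function names are hypothetical.

```python
def second_divided_difference(phi, a0, a1, a2):
    """phi[a0, a1, a2] for three distinct abscissae."""
    d01 = (phi(a1) - phi(a0)) / (a1 - a0)
    d12 = (phi(a2) - phi(a1)) / (a2 - a1)
    return (d12 - d01) / (a2 - a0)


def curvature_estimate(phi, a0, a1, a2, small=1.0e-12):
    """Estimate of d = u^T A u, replaced by a small positive number if needed."""
    d = second_divided_difference(phi, a0, a1, a2)
    return d if d > 0.0 else small


def f(x):
    # x^T A x - 2 b^T x with A = [[4, 1], [1, 3]] and b = (1, 2)
    return (4.0 * x[0] ** 2 + 2.0 * x[0] * x[1] + 3.0 * x[1] ** 2
            - 2.0 * x[0] - 4.0 * x[1])


x_prev = [0.5, -1.0]
u = [1.0, 2.0]


def phi(alpha):
    return f([xj + alpha * uj for xj, uj in zip(x_prev, u)])


d = curvature_estimate(phi, 0.0, 0.5, 1.5)
print(d)  # u^T A u = 20 for this A and u
```

The three function values are the ones already needed for the parabolic line minimization, so the estimate of the diagonal element costs no further evaluations of f .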


Let

V = U D^{-1/2}  (4.6)

be the matrix with columns v_1, ..., v_n given by (3.14), and let

H = A^{-1} .  (4.7)

Since U is nonsingular, equation (4.2) gives

H = U D^{-1} U^T = V V^T .  (4.8)

The matrix V is easily computed from U in n^2 multiplications and
n square roots, but the computation of V V^T is more expensive, and can
be avoided: see below.

Our aim is to find the principal axes of the quadratic form f,
i.e., to find an orthogonal matrix Q such that

$$Q^T A Q = \Lambda, \qquad (4.9)$$

where $\Lambda = \mathrm{diag}(\lambda_i)$ is diagonal. Thus, the columns $q_i$
of Q are just the eigenvectors of A, with corresponding eigenvalues
$\lambda_1, \ldots, \lambda_n$, and we can assume that
$\lambda_1 \ge \ldots \ge \lambda_n$. The obvious way to find Q and $\Lambda$
is to compute $H = V V^T$ explicitly, and then find Q and $\Lambda$ such that

$$Q^T H Q = \Lambda^{-1} \qquad (4.10)$$



by finding the eigensystem of H.

If the condition number $\kappa = \lambda_1/\lambda_n$ is of order
$\epsilon^{-1}$, where $\epsilon$ is the relative machine precision (see
Section 4.2), then rounding errors may lead to disastrous errors in the
computed small eigenvalues $\lambda_1^{-1}, \ldots$ of H, and in the
corresponding eigenvectors $q_1, \ldots$, even if they are well-determined
by V. Thus, it may be necessary to compute H, and find its eigensystem,
using double precision arithmetic. This difficulty can be avoided if,
instead of forming $H = V V^T$, we work directly with V. Suppose that we
find the singular value decomposition of V, i.e., find orthogonal
matrices Q and Q' such that

$$Q^T V Q' = \Sigma, \qquad (4.11)$$

where $\Sigma = \mathrm{diag}(\sigma_i)$ is a diagonal matrix. (See Golub and
Kahan (1965), and Kogbetliantz (1955).) Then

$$\Lambda^{-1} = Q^T H Q = (Q^T V Q')(Q^T V Q')^T = \Sigma^2, \qquad (4.12)$$

so Q is the desired matrix of eigenvectors of A, and the eigenvalues
are given by

$$\lambda_i = \sigma_i^{-2}. \qquad (4.13)$$

Note that the matrix Q' is not required, and it is not necessary to
compute $V V^T$.

Since it is desirable that the computed matrix Q should be close
to an orthogonal matrix, we suggest that Q and $\Sigma$ should be found by
the method of Golub and Reinsch (1970). This involves reducing V to
bidiagonal form by Householder transformations, and then computing the
singular value decomposition of the bidiagonal matrix by a variant of
the QR algorithm.
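
The relations (4.6) to (4.13) can be illustrated with a small sketch. It is not the Golub and Reinsch method recommended above: for brevity it uses a one-sided Jacobi (Hestenes) iteration, which rotates the columns of V until they are orthogonal, so that $V Q' = Q \Sigma$ and Q' never has to be stored. The conjugate directions are produced here by Gram-Schmidt conjugation of the coordinate axes, as a stand-in for Powell's procedure. All function names and the 2 by 2 test matrix are assumptions made for illustration.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_directions(A):
    """A-conjugate directions built from the coordinate axes (stand-in for Powell)."""
    n = len(A)
    U = []
    for k in range(n):
        u = [1.0 if i == k else 0.0 for i in range(n)]
        for w in U:
            coef = dot(w, matvec(A, u)) / dot(w, matvec(A, w))
            u = [ui - coef * wi for ui, wi in zip(u, w)]
        U.append(u)
    return U


def scaled_matrix(A, U):
    """V = U D^{-1/2} as a list of rows, with d_i = u_i^T A u_i."""
    n = len(A)
    cols = [[ui / math.sqrt(dot(u, matvec(A, u))) for ui in u] for u in U]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def left_singular_system(V, sweeps=30, tol=1e-15):
    """One-sided Jacobi: returns Q (left singular vectors) and sigma."""
    n = len(V)
    W = [row[:] for row in V]
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(W[i][p] ** 2 for i in range(n))
                beta = sum(W[i][q] ** 2 for i in range(n))
                gamma = sum(W[i][p] * W[i][q] for i in range(n))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(n):
                    wp, wq = W[i][p], W[i][q]
                    W[i][p] = c * wp - s * wq
                    W[i][q] = s * wp + c * wq
        if not rotated:
            break
    sigma = [math.sqrt(sum(W[i][j] ** 2 for i in range(n))) for j in range(n)]
    Q = [[W[i][j] / sigma[j] for j in range(n)] for i in range(n)]
    return Q, sigma


A = [[2.0, 1.0], [1.0, 2.0]]          # illustrative matrix, eigenvalues 1 and 3
V = scaled_matrix(A, conjugate_directions(A))
Q, sigma = left_singular_system(V)
eigenvalues = [s ** -2 for s in sigma]  # equation (4.13)
```

Note that A is used only to generate V in this sketch; the eigensystem itself is recovered from V alone, without forming $V V^T$.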

Let  us  compare  the  amount  of  computational  work  involved  in 
computing  Q  and  A  via 

1.  The  singular  value  decomposition  (SVD)  of  V  as  described 
above,  and 

2.  Finding  the  matrix  H  and  its  eigensystem,  using  Householder's
reduction  to  tridiagonal  form  and  then  the  QR  algorithm.  (See 
Bowdler,  Martin,  Reinsch  and  Wilkinson  (1968),  Francis  (1962), 
Householder  (1964),  Kublanovskaya  (1961),  Martin,  Reinsch  and 
Wilkinson  (1968),  and  Wilkinson  (1965a,  b,  1968) .) 

For purposes of comparison, we count only multiplications, and
ignore terms of order $n^2$, so our conclusions may not be valid for very
small n. Suppose that, in each case, the QR process requires pn
iterations, for some modest number p.

For method 1, the Householder reduction requires $4n^3/3$
multiplications, accumulation of the (left-hand) transformations requires
another $4n^3/3$ multiplications, and the QR process with accumulation of
the transformations requires $2pn^3$ multiplications, if no splitting
occurs. Thus, method 1 requires $(8+6p)n^3/3$ multiplications in all.

For method 2, the Householder reduction requires $2n^3/3$ multiplications
(only half as much as for method 1 because of symmetry), accumulation of
the transformations requires $2n^3/3$ multiplications, and the QR process
requires $2pn^3$, giving $(4+6p)n^3/3$ altogether. This could be reduced
to $4n^3/3$, still ignoring terms of order $n^2$, if inverse iteration were
used to compute the eigenvectors of the tridiagonal matrix, but then it

scbd should be fairly small (say about 10) unless the axes are very
badly scaled initially. The automatic scaling is worthwhile, but its
effect is not dramatic, and it is rather unreliable, which is the reason
for introducing scbd. Thus, it is still worthwhile for the user to
try to scale his problem as well as possible.


Another  modification 

For Powell's basic procedure to minimize a positive definite
quadratic form in n iterations, steps 1 to 3 of the first iteration
are unnecessary. Thus, our algorithm omits steps 1 to 3 on the first
iteration, and, subsequently, after each singular value decomposition
(i.e., at the (n+1)-st, (2n+1)-st, ..., iterations). For this reason,
there are exactly

$$1 + (n-1)(n+1) = n^2 \qquad (4.16)$$

linear minimizations, instead of n(n+1), between each singular value
decomposition. This modification is not important for large n, but
numerical results suggest that it is worthwhile for small n.


5.  The  "resolution  ridge"  problem 

Suppose temporarily that we are trying to maximize a function $f(x_1, x_2)$
of two variables by an ascent method. Wilde (1964) points out that
rounding errors in the computation of f may lead to premature termination
because of the "resolution ridge" problem illustrated in Diagram 5.1.

Diagram 5.1: A resolution ridge

Regarding the surface defined by $f(x_1, x_2)$ as a hill, we may reach
a point $x_0$, situated on a narrow ridge, and then try to proceed to a
higher point by performing linear searches in certain directions.
Suppose, for example, that we attempt linear searches in the EW and NS
directions. The point $x_0$ may not be at the true maximum of f in both
these directions but, because of the effect of rounding errors in
evaluating f, our one-dimensional search procedure will only attempt to
locate the position of maxima to within some positive tolerance $\delta$ (see
Section 2). Let $x_E = x_0 + \delta e_1$, $x_W = x_0 - \delta e_1$,
$x_N = x_0 + \delta e_2$, and $x_S = x_0 - \delta e_2$. As shown in the
diagram, it may happen that $f(x_0)$ is greater than each of $f(x_E)$,
$f(x_W)$, $f(x_N)$, and $f(x_S)$, so $x_0$ is within the tolerance $\delta$
of local maxima in both of the search directions, even though $x_0$ may be
a long way from the true maximum, which could be reached by climbing up
the ridge. The same problem can arise with functions of more than two
variables, or when we are looking for a minimum rather than a maximum
(then we might speak of a "resolution valley" problem).
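
The situation can be made concrete with a small, purely illustrative example. The function below is an assumed ridge (rising slowly along $x_1 = x_2$ and falling steeply across it), and the tolerance $\delta = 0.01$ is an arbitrary choice. Neither is taken from the text.

```python
def f(x1, x2):
    # Hypothetical ridge: gentle ascent along x1 = x2, steep descent across it.
    return (x1 + x2) - 1000.0 * (x1 - x2) ** 2


delta = 0.01          # assumed tolerance of the one-dimensional search
x0 = (0.0, 0.0)
f0 = f(*x0)

# Function values at the four points x_E, x_W, x_N, x_S of the text.
neighbours = {
    "E": f(x0[0] + delta, x0[1]),
    "W": f(x0[0] - delta, x0[1]),
    "N": f(x0[0], x0[1] + delta),
    "S": f(x0[0], x0[1] - delta),
}

# x0 looks like a maximum, to within delta, in both search directions...
stuck = all(value < f0 for value in neighbours.values())

# ...although a step of the same size along the ridge gives a higher point.
along_ridge = f(x0[0] + delta, x0[1] + delta)
```

For this function the exact maximum along either axis through $x_0$ lies only 1/2000 away from $x_0$, well inside the tolerance, so axis searches make essentially no progress even though f is unbounded along the ridge.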

It is clear from the diagram that, if we know another point $x_1$
on the ridge, then a linear search in the direction $x_1 - x_0$ will give
a point $x_2$ with $f(x_2) > f(x_0)$, unless the ridge is sharply curved.
This is the motivation for the method suggested by Rosenbrock (1960),
and improved by Davies, Swann and Campey. (See Swann (1964), and also
Andrews (1969), Baer (1962), Fletcher (1965, 1969c, d), Osborne (1969),
Palmer (1969), Powell (1968a), Rice (1966), and Section 7.)


Finding another point on the ridge

If linear searches from the point $x_0$ fail to give a higher point,
and a resolution ridge is suspected, then the following strategy may be
successful: take a step of length, say $10\delta$, in a random direction
from $x_0$, reaching the point $x_0'$. Then perform one or more linear
searches, starting at $x_0'$, and reaching the point $x_1$. As the diagram
shows, the point $x_1$ is likely to be on the ridge, so a linear search in
the direction $x_1 - x_0$ may be successful.

Although he does not refer to the resolution ridge problem,
Powell (1964) incorporates such a strategy in his stopping criterion.
We propose to use this strategy during the regular iterations as well.


Incorporating a random step into Powell's basic procedure

Suppose that we are commencing iteration k of Powell's basic
procedure, counting either from the start or from the last singular
value decomposition, and $2 \le k \le n$. To ensure quadratic convergence,
we must search along the directions $u_{n-k+2}, \ldots, u_n$ in step 1 of
iteration k, but the searches along directions $u_1, \ldots, u_{n-k+1}$ are
not necessary for quadratic convergence. (They are desirable for other
reasons: see Fletcher (1965) for a comparison of Powell's method and
Smith's method.) The quadratic convergence property still holds if,
at step 1, we move to any point

$$x_{n-k+1} = x_0 + \sum_{i=1}^{n} \beta_i' u_i \qquad (5.1)$$

with $\beta_1' \neq 0$, before performing linear searches in the directions
$u_{n-k+2}, \ldots, u_n$. Thus, before performing linear searches in
directions $u_1, \ldots, u_n$ at step 1 of iteration k, we may try the
random step strategy as described above. Procedure praxis does this if
the problem appears to be ill-conditioned, or if the procedure is about
to terminate (i.e., if previous linear searches have failed to find a
better approximation to the minimum).

This modification is not necessary for well-conditioned problems,
but numerical results show that it is essential in order to ensure that a
good approximation to the minimum is found for very ill-conditioned
problems. For example, consider minimizing

$$f(x) = x^T A x, \qquad (5.2)$$

where A is a 10 by 10 Hilbert matrix (i.e., $a_{ij} = 1/(i+j-1)$
for $1 \le i, j \le 10$), with a condition number of $1.6 \times 10^{13}$.
Using long real on an IBM 360 computer (machine precision $16^{-13}$), and
starting from $(1, 1, \ldots, 1)^T$, our algorithm successfully found the
position of the minimum of f(x) to within the specified tolerance
of $10^{-5}$, but it failed without the random step strategy. (For further
details, see Section 7.)
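
For readers who wish to reproduce the test problem (5.2), a minimal sketch of the Hilbert quadratic form follows. Exact rational arithmetic is used only to keep the illustration free of rounding error. The function names are hypothetical.

```python
from fractions import Fraction


def hilbert(n):
    """n by n Hilbert matrix with entries a_ij = 1/(i+j-1), 1 <= i, j <= n."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def quadratic_form(A, x):
    """f(x) = x^T A x, as in equation (5.2)."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


A10 = hilbert(10)
start = [1] * 10                       # the starting point (1, 1, ..., 1)^T
f_start = quadratic_form(A10, start)   # exact value of f at the start
```

Since the Hilbert matrix is positive definite, the minimum of f is 0 at x = 0; the difficulty lies entirely in the conditioning.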


Extrapolation  along  the  ridge 

If the function minimizer has been climbing a ridge for several
complete cycles, so the quadratic approximation to f is obviously
inadequate (or the maximum would already have been found), then it may
be worthwhile to try an extrapolation along the ridge. Suppose that
immediately before three successive singular value decompositions, the best
approximations to the maximum are $x'$, $x''$, and $x'''$, with
$d_0 = \|x' - x''\|_2 > 0$ and $d_1 = \|x'' - x'''\|_2 > 0$. Numerical tests
indicate that curved ridges are often approximated fairly well by the
space-curve given parametrically by

$$x(\lambda) = \frac{\lambda(\lambda - d_1)}{d_0(d_0 + d_1)} x' - \frac{(\lambda + d_0)(\lambda - d_1)}{d_0 d_1} x'' + \frac{\lambda(\lambda + d_0)}{d_1(d_0 + d_1)} x''', \qquad (5.3)$$

which is chosen because $x(-d_0) = x'$, $x(0) = x''$, and $x(d_1) = x'''$.

Hence, before the 3rd, 4th, 5th, ... singular value decompositions,
procedure praxis (see Section 9) moves to the point $x(\lambda_0)$, where
$\lambda_0$ is chosen to approximately minimize $f(x(\lambda))$. $\lambda_0$
is computed by the same procedure that performs linear searches.
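
A minimal sketch of the curve through three successive best points, assuming it is the quadratic (Lagrange) interpolant with nodes $-d_0$, 0 and $d_1$, which is the unique parabola satisfying the three interpolation conditions stated above. The function name and the sample points are illustrative assumptions.

```python
import math


def ridge_curve(xp, xpp, xppp):
    """Return (x_of, d0, d1), where x_of(lam) is the parabola through
    x' at -d0, x'' at 0 and x''' at d1."""
    d0 = math.dist(xp, xpp)
    d1 = math.dist(xpp, xppp)
    if d0 <= 0.0 or d1 <= 0.0:
        raise ValueError("the three points must be distinct")

    def x_of(lam):
        c0 = lam * (lam - d1) / (d0 * (d0 + d1))
        c1 = -(lam + d0) * (lam - d1) / (d0 * d1)
        c2 = lam * (lam + d0) / (d1 * (d0 + d1))
        return [c0 * a + c1 * b + c2 * c for a, b, c in zip(xp, xpp, xppp)]

    return x_of, d0, d1


# Three made-up points lying on a curved path.
x_of, d0, d1 = ridge_curve([0.0, 0.0], [1.0, 1.0], [2.0, 4.0])
extrapolated = x_of(1.5 * d1)   # a trial point beyond the newest one
```

In the procedure described above, the parameter would be chosen by the linear search routine applied to $f(x(\lambda))$ rather than fixed in advance as it is here.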


6.  Some  further  details 

In this section we give some more details of the ALGOL procedure
given in Section 9. The criterion for discarding search directions, the
linear search procedure, and the stopping criterion are described briefly.
(For the sake of clarity, some unimportant details are omitted.)

The  discarding  criterion 

Suppose for the moment that f(x) is the quadratic form given by
equation (3.7). In steps 2 and 3 of Powell's basic procedure (see Section 3),
we effectively discard the search direction $u_1$, and replace it by
$x_n - x_0$. The algorithm suggested by Powell does not necessarily discard
$u_1$: instead, as mentioned in Section 3, it discards one of
$u_1, \ldots, u_n$, $u_{n+1} = x_n - x_0$, so as to maximize

$$|\det(v_1 \ldots v_n)|, \qquad (6.1)$$

where $v_i$ is given by equation (3.14), after renumbering the remaining
n directions. We wish to retain convergence for a quadratic form in
n iterations, so we are not free to discard any one of $u_1, \ldots, u_n$.
At the k-th iteration, for $2 \le k \le n$, we can discard any one of
$u_1, \ldots, u_{n-k+1}$ without losing quadratic convergence (see Section 5).
For lack of a better criterion, we choose to discard the direction, from
$u_1, \ldots, u_{n-k+1}$, to maximize the resulting determinant (6.1).

Suppose that the new direction $x_n - x_0 = u_{n+1}$ satisfies

$$(u_{n+1}^T A u_{n+1})^{-1/2} u_{n+1} = \sum_{i=1}^{n} \alpha_i v_i. \qquad (6.2)$$

Then, the effect of discarding $u_i$ and replacing it by $u_{n+1}$ (and then
renumbering the directions) is to multiply the determinant (6.1) by
$|\alpha_i|$, so our criterion is to choose i, with $1 \le i \le n-k+1$, so
that $|\alpha_i|$ is at its maximum. If $\beta_1, \ldots, \beta_n$ are as in
the description of Powell's basic procedure (see Section 3), and the linear
minimization with step $\beta_i u_i$ decreases f(x) by an amount $\Delta_i$,
then, from (3.7),

$$\Delta_i = \beta_i^2 u_i^T A u_i, \qquad (6.3)$$

so $\sqrt{\Delta_i}/|\beta_i|$ may be used as an estimate of
$(u_i^T A u_i)^{1/2}$. (If $\beta_i = 0$ then we use the result of a
previous iteration.)

Suppose that the random step procedure described in Section 5 moves
from $x_0$ to

$$x_0' = x_0 + \sum_{i=1}^{n} \gamma_i u_i \qquad (6.4)$$

before the linear searches in the directions $u_1, \ldots, u_n$ are performed.
Then

$$u_{n+1} = x_n - x_0 = \sum_{i=1}^{n} (\beta_i + \gamma_i) u_i, \qquad (6.5)$$

and the $\beta_i'$ of equation (5.1) are given by

$$\beta_i' = \begin{cases} \beta_i + \gamma_i & \text{if } 1 \le i \le n-k+1, \\ \gamma_i & \text{if } n-k+2 \le i \le n. \end{cases} \qquad (6.6)$$

From (6.2), (6.3) and (6.5),

$$(u_{n+1}^T A u_{n+1})^{1/2} \alpha_i = (\beta_i + \gamma_i) \sqrt{\Delta_i} / |\beta_i|, \qquad (6.7)$$

so we must discard direction $u_i$, $1 \le i \le n-k+1$, to maximize the
modulus of the right side of (6.7). Since this does not explicitly
depend on the matrix A, the same criterion is used even if f is not
necessarily a quadratic form. Note that our criterion reduces to Powell's,
apart from our restriction that $i \le n-k+1$, if there are no random steps,
i.e., if $\gamma_i = 0$ for i = 1, ..., n. Quadratic convergence is guaranteed
(apart from the effect of rounding errors) unless, for some k = 2, ..., n,

$$\beta_1' = \beta_2' = \ldots = \beta_{n-k+1}' = 0 \qquad (6.8)$$

at iteration k.
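
The selection rule based on (6.7) can be sketched as follows. This is an illustration under stated assumptions, not the code of procedure praxis: the function name, the 0-based indexing, and the `previous` argument (standing for estimates of $(u_i^T A u_i)^{1/2}$ kept from an earlier iteration, used when $\beta_i = 0$) are all hypothetical.

```python
import math


def direction_to_discard(beta, gamma, delta, k, previous=None):
    """Return the 0-based index i, among the first n-k+1 directions, that
    maximizes |(beta_i + gamma_i) * sqrt(Delta_i) / |beta_i||, cf. (6.7)."""
    n = len(beta)
    best_index, best_value = None, -1.0
    for i in range(n - k + 1):
        if beta[i] != 0.0:
            # Estimate of (u_i^T A u_i)^(1/2) from equation (6.3).
            root_d = math.sqrt(max(delta[i], 0.0)) / abs(beta[i])
        elif previous is not None:
            root_d = previous[i]   # fall back on an earlier estimate
        else:
            root_d = 0.0           # assumed default when nothing is known
        value = abs((beta[i] + gamma[i]) * root_d)
        if value > best_value:
            best_index, best_value = i, value
    return best_index


# Made-up numbers for n = 3 at iteration k = 2 (candidates: first two directions).
beta = [0.5, 2.0, 1.0]
delta = [1.0, 4.0, 9.0]
without_random_step = direction_to_discard(beta, [0.0, 0.0, 0.0], delta, k=2)
with_random_step = direction_to_discard(beta, [3.0, 0.0, 0.0], delta, k=2)
```

With no random step the second direction is chosen in this example; a random step with a large component along the first direction changes the choice, as (6.7) suggests it should.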

The  linear  search 

Our linear search procedure is similar to that suggested by Powell
(1964). We wish to find a value of $\lambda$ which approximately minimizes

$$\varphi(\lambda) = f(x_0 + \lambda u), \qquad (6.9)$$

where the initial point $x_0$ and direction $u \neq 0$ are given, and
$\varphi(0) = f(x_0)$ is already known. If a linear search in the direction u
has already been performed, or if u resulted from a singular value
decomposition, then an estimate of $\varphi''(0)$ is available. A parabola
$P(\lambda)$ is fitted to $\varphi(\lambda)$, using $\varphi(0)$, the estimate
of $\varphi''(0)$ if available, and the computed value of $\varphi(\lambda)$ at
another point, or at two points if there is no estimate of $\varphi''(0)$. If
$P(\lambda)$ has a minimum at $\lambda = \lambda^*$, and
$\varphi(\lambda^*) < \varphi(0)$, then $\lambda^*$ is accepted as a value of
$\lambda$ to approximately minimize (6.9). Otherwise $\lambda^*$ is replaced
by $\lambda^*/2$, $\varphi(\lambda^*)$ is re-evaluated, and the test is
repeated. (After a number of unsuccessful tries, the procedure returns
with $\lambda^* = 0$.)
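
The description above might be sketched as follows. This is a simplified illustration, not the linear search of procedure praxis: the choice of trial points, the fallback when the fitted parabola has no minimum, and the parameters `step` and `max_tries` are assumptions.

```python
def linear_search(phi, phi0, d2=None, step=1.0, max_tries=10):
    """Approximately minimize phi(lam), given phi0 = phi(0) and, optionally,
    an estimate d2 of phi''(0). Returns the accepted lam, or 0.0 on failure."""
    phi1 = phi(step)
    if d2 is None:
        # No curvature estimate: use a second trial point and take twice the
        # second divided difference of phi at 0, step and step2.
        step2 = 2.0 * step if phi1 < phi0 else -step
        phi2 = phi(step2)
        slope1 = (phi1 - phi0) / step
        slope2 = (phi2 - phi0) / step2
        d2 = 2.0 * (slope2 - slope1) / (step2 - step)
    if d2 > 0.0:
        # Parabola P(lam) = phi0 + slope0*lam + d2*lam^2/2 through (step, phi1).
        slope0 = (phi1 - phi0) / step - 0.5 * d2 * step
        lam = -slope0 / d2
    else:
        lam = step  # assumed fallback: the parabola has no minimum
    for _ in range(max_tries):
        if phi(lam) < phi0:
            return lam
        lam *= 0.5  # halve the step and test again
    return 0.0


# Exact for a quadratic: phi(lam) = (lam - 3)^2 has its minimum at lam = 3.
lam_no_estimate = linear_search(lambda lam: (lam - 3.0) ** 2, 9.0)
lam_with_estimate = linear_search(lambda lam: (lam - 3.0) ** 2, 9.0, d2=2.0)
```

When the curvature estimate is supplied, only one new function value is needed to fit the parabola, which is the saving in function evaluations that the comparison of methods later relies on.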

The  stopping  criterion 

The user of procedure praxis provides two parameters: t (a positive
absolute tolerance), and $\epsilon$ (i.e., macheps, the machine precision);
and the procedure attempts to return x satisfying

$$\|x - \mu\|_2 \le \epsilon^{1/2} \|x\|_2 + t, \qquad (6.10)$$

where $\mu$ is the position of the true local minimum near x. The
exact form of the right side of (6.10) is not important, and could
easily be changed if desired. It was chosen because of the analogy with
the one-dimensional case (see Chapter 5).
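
For test problems where the minimizer $\mu$ is known, condition (6.10) can be checked directly. The helper below is a hypothetical illustration; the sample vectors are made up, and the values of macheps and t are the ones quoted for the numerical tests.

```python
import math


def within_tolerance(x, mu, macheps, t):
    """True if ||x - mu||_2 <= sqrt(macheps) * ||x||_2 + t, cf. (6.10)."""
    distance = math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, mu)))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    return distance <= math.sqrt(macheps) * norm_x + t


macheps = 16.0 ** -13
t = 1e-5
close_enough = within_tolerance([1.0, 1.0], [1.0, 1.0 + 1e-6], macheps, t)
too_far = within_tolerance([1.0, 1.0], [1.0, 1.1], macheps, t)
```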

It is impossible to guarantee that (6.10) will hold for all
functions f, or even for f which are $C^2$ near $\mu$. Our stopping
criterion is, however, rather cautious, and (6.10) is satisfied for all
numerical examples discussed in Section 7, with the sole exception of
the extremely ill-conditioned problem

$$f(x) = x^T A x, \qquad (6.11)$$

where A is a 12 by 12 Hilbert matrix, with a condition number
$\kappa \simeq 1.7 \times 10^{16} > \epsilon^{-1} \simeq 4 \times 10^{15}$. In
most cases the stopping criterion is over-cautious, and some unnecessary
function evaluations are performed. Let us remark, as does Powell (1964),
that the stopping criterion is not an essential part of our algorithm, so
an improved criterion could easily be incorporated.

Let $x'$ be the current best approximation to the minimum before an
iteration of the basic procedure, and let $x''$ be the best approximation
after the iteration, i.e., n linear searches later. We test if


(6.12) 


The stopping criterion is simply to stop, and return the approximation
$x''$, if (6.12) is satisfied for a prescribed number of consecutive
iterations. The number of consecutive iterations depends on how cautious
we wish to be: 2 is reasonable, and was used for the examples
described in Section 7. Because of the random step strategy described
in Section 5, and always adopted if (6.12) was satisfied on the previous
iteration, there is no need for a more complicated criterion, such as
the one used by Powell (1964).


7. Numerical results and comparison with other methods

The ALGOL W procedure "praxis", given in Section 9, has been tested
on IBM 360/67 and 360/91 computers with machine precision $16^{-13}$. In
this section we summarize the results of the numerical tests, and compare
them with results for other methods reported in the literature. Our
procedure has also been translated into SAIL (an extension of ALGOL:
see Swinehart and Sproull (1970)) and used to solve least-squares
parameter-fitting problems with up to 16 variables on a PDP 10 computer
(machine precision 2 ). The parameter-fitting problem is described
in Sobel (1970).


Table 7.1 summarizes the performance of procedure praxis on the
test functions described below. In all cases the tolerance $t = 10^{-5}$
and macheps $= 16^{-13}$. The table gives the number of variables, n;
the initial step-size (a rough estimate of the distance to the minimum),
h; and the starting point, $x_0$. So that the results can be compared
with those of methods with a different stopping criterion, we give the
number $n_f$ of function evaluations, and the number $n_l$ of linear
searches (including any parabolic extrapolations), required to reduce
$f(x) - f(\mu)$ below $10^{-10}$, where $f(\mu)$ is the true minimum of f.
As f(x) was only printed out after each iteration of the basic procedure,
i.e., after every n linear minimizations, the number of function
evaluations required to reduce $f(x) - f(\mu)$ to $10^{-10}$ is often slightly
less than $n_f$, so we also give the actual value of $f(x) - f(\mu)$ after
$n_f$ function evaluations. Finally, the table gives $\kappa$, the estimated
condition number of the problem. Except for the few cases where it is
easily found analytically, $\kappa$ is estimated from the computed singular
values, and may be rather inaccurate.

For those examples marked with an asterisk, the random step strategy
was used from the start. (In the initialization phase of procedure
praxis, the variable "illc" was set to true.) For the other examples
the procedure was used as given in Section 9 (with "illc" set to false
initially). Although the automatic scaling feature (see Section 4)
reduces $n_f$ by about 25 percent for some of the badly scaled problems,
this feature was switched off for the examples given in the table. (The
bound "scbd" of equation (4.15) was set to 1.)

Definitions of the test functions, and comments on the results
summarized in Table 7.1, are given after the table.

A  cautionary  note 

When  comparing  different  minimization  methods,  such  as  ours, 

Powell’s  and  Stewart’s,  the  reader  should  not  forget  that  the  numerical 
results  reported  for  the  methods  may  have  been  obtained  on  different 
computers  (with  different  word-lengths),  and  with  different  linear  search 
procedures.  The  effect  of  different  word-lengths  should  only  be 
significant  in  the  final  stages  of  the  search,  when  rounding  errors 
determine  the  limiting  accuracy  attainable,  except  for  ill-conditioned 
problems  (say  k  >  10  )  .  This  is  another  reason  why  we  prefer  to 
consider the number of function evaluations required to reduce $f(x) - f(\mu)$
to a reasonable threshold (say $10^{-10}$), rather than the number required
for convergence.

Because apparently minor differences in the linear search procedures
can be quite important, Fletcher (1965) prefers to consider the number
of linear searches, $n_l$, instead of the number of function evaluations,
$n_f$. This approach discriminates against methods, such as Powell's,
which use most of the search directions several times, and can thus use
second derivative estimates to reduce the number of function evaluations
required for the second and later searches in each direction. Note that,
for the examples given in Table 7.1, $n_f/n_l$ lies between 2.1 and 2.7,
but it would be at least 3.0 for methods which do not use second
derivative information, if the linear search involves fitting a parabola
and evaluating f at the minimum of the parabola. Also, there are
promising methods which do not use linear searches at all (see Broyden (1967),
Davidon (1968, 1969), Goldstein and Price (1967), and Powell (1970e)),
and these methods could presumably be adapted to accept difference
approximations to derivatives. Thus, we prefer to compare methods on
the basis of the number of function evaluations required, and regard
the linear search procedure, if any, as an integral part of each method.


Table 7.1: Results for various test functions

(An entry such as 6.61'-18 denotes 6.61 x 10^-18.)

Function     n    h     x_0^T           n_f    n_l    f(x)-f(μ)    κ
Rosenbrock   2    1     (-1.2, 1)       120    47     6.61'-18     2508
Rosenbrock   2    3     (3, 3)          110    42     8.53'-17     2508
Rosenbrock   2    12    (8, 8)          181    67     9.71'-18     2508
Cube         2    1     (-1.2, -1)      177    68     7.18'-18     10018
Beale        2    1     (0.1, 0.1)      54     22     2.00'-15     162
Helix        3    1     (-1, 0, 0)      155    67     1.75'-11     500
Powell       3    1     (0, 1, 2)       55     23     1.99'-11     28
Box*         3    20    (0, 10, 20)     100    37     2.57'-13     8300
Singular*    4    1     (3, -1, 0, 1)   234    106    9.76'-11     ∞
Wood*        4    10    -(3, 1, 3, 1)   452    191    6.06'-14     1400
Chebyquad    2    0.1   x_i = i/(n+1)   31     12     7.89'-20     1.3
Chebyquad    4    0.1   x_i = i/(n+1)   74     32     7.89'-11     7
Chebyquad    6    0.1   x_i = i/(n+1)   223    101    7.00'-13     50
Chebyquad    8    0.1   x_i = i/(n+1)   326    147    6.32'-11     200?
Watson*      6    1     0^T             316    145    2.83'-12     86000
Watson*      9    1     ?               n84           3.18'-11     Eh
Tridiag      4    8     0^T             27     11     0            29.3
Tridiag      6    12    0^T             51     22     0            64.9
Tridiag      8    16    0^T             126    55     0            113
Tridiag      10   20    0^T             201    89     1.56'-15     175
Tridiag      12   24    0^T             259    118    2.23'-15     250
Tridiag      16   32    0^T             488    222    1.26'-13     438
Tridiag      20   40    0^T             805    379    0            677
Hilbert      2    10    (1, ..., 1)     11     4      3.98'-15     19
Hilbert      4    10    (1, ..., 1)     50     22     6.11'-15     1.5'4
Hilbert      6    10    (1, ..., 1)     133    58     1.50'-11     1.5'7
Hilbert      8    10    (1, ..., 1)     262    119    8.14'-11     1.5'10
Hilbert+     10   10    (1, ..., 1)     592    267    7.84'-11     1.6'13
Hilbert+     12   10    (1, ..., 1)     731    328    1.98'-11     1.7'16

*  For these results we set illc := true in the initialization
phase of procedure praxis, and the random number generator was
initialized by calling raninit(2) in procedure test.

+  For these results the stopping criterion was more conservative:
we set ktm := 4 in the initialization phase of procedure praxis.

Definitions of the test functions and comments on Table 7.1

Rosenbrock (Rosenbrock (1960)):

$$f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2. \qquad (7.1)$$

This is a well-known function with a parabolic valley. Descent methods
tend to fall into the valley, and then follow it around to the minimum
at $(1, 1)^T$. Details of the progress of the algorithm, for the starting
point $(-1.2, 1)^T$, are given in Table 7.2. In Diagram 7.1 we compare
these results with those reported for Stewart's method (Stewart (1967)),
Powell's method, and the method of Davies, Swann and Campey (as reported
by Fletcher (1965)). The graph shows that our method compares favourably
with the other methods. Although the function (7.1) is rather artificial,
similar curved valleys often arise when penalty function methods are used
to reduce constrained problems to unconstrained problems: consider
minimizing $(1 - x_1)^2$, with the constraint that $x_2 = x_1^2$, by a
simple-minded penalty function method.
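
The remark about penalty functions can be illustrated directly. In this sketch, `penalized` is a hypothetical helper that adds a quadratic penalty for violating the constraint $x_2 = x_1^2$; with penalty weight 100 it coincides with (7.1).

```python
def rosenbrock(x):
    """Rosenbrock's function, equation (7.1)."""
    x1, x2 = x
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2


def penalized(x, weight):
    """(1 - x1)^2 plus a quadratic penalty on the constraint x2 = x1^2."""
    x1, x2 = x
    return (1.0 - x1) ** 2 + weight * (x2 - x1 ** 2) ** 2


start = [-1.2, 1.0]
f_start = rosenbrock(start)          # value at the usual starting point
f_minimum = rosenbrock([1.0, 1.0])   # value at the minimum (1, 1)
```

Larger penalty weights enforce the constraint more strictly but make the valley narrower, which is what makes such problems hard for descent methods.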

Cube (Leon (1966)):

$$f(x) = 100(x_2 - x_1^3)^2 + (1 - x_1)^2. \qquad (7.2)$$

This function is similar to Rosenbrock's, and much the same remarks
apply. Here the valley follows the curve $x_2 = x_1^3$.

Beale (Beale (1958)):

$$f(x) = \sum_{i=1}^{3} (c_i - x_1(1 - x_2^i))^2, \qquad (7.3)$$

where $c_1 = 1.5$, $c_2 = 2.25$, $c_3 = 2.625$. This function has a valley
approaching the line $x_2 = 1$, and has a minimum of 0 at $(3, \frac{1}{2})^T$.
Kowalik and Osborne (1968) report that the Davidon-Fletcher-Powell
algorithm reduced f to $2.18 \times 10^{-11}$ in 20 function and gradient
evaluations (equivalent to 60 function evaluations if the usual (n+1)
weighting factor is used), and Powell's method required 86 function
evaluations to reduce f to $2.94 \times 10^{-8}$. Thus, our method compares
favourably on this example.
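
A short sketch of (7.3), which also confirms that the stated minimizer $(3, \frac{1}{2})^T$ makes every term vanish. The function name is illustrative.

```python
def beale(x):
    """Beale's function, equation (7.3)."""
    x1, x2 = x
    c = (1.5, 2.25, 2.625)
    return sum((c[i - 1] - x1 * (1.0 - x2 ** i)) ** 2 for i in (1, 2, 3))


f_at_minimum = beale([3.0, 0.5])
f_at_start = beale([0.1, 0.1])   # starting point used in Table 7.1
```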


Helix (Fletcher and Powell (1963)):

$$f(x) = 100((x_3 - 10\theta)^2 + (r - 1)^2) + x_3^2, \qquad (7.4)$$

where

$$r = (x_1^2 + x_2^2)^{1/2} \qquad (7.5)$$

and

$$2\pi\theta = \begin{cases} \arctan(x_2/x_1) & \text{if } x_1 > 0, \\ \pi + \arctan(x_2/x_1) & \text{if } x_1 < 0. \end{cases} \qquad (7.6)$$

This function of three variables has a helical valley, and a minimum
at $(1, 0, 0)^T$. The results are given in more detail in Table 7.3 and
Diagram 7.2. For this example our method is faster than Powell's
method, but slightly slower than Stewart's.

Powell (Powell (1964)):

$$f(x) = 3 - \frac{1}{1 + (x_1 - x_2)^2} - \sin\left(\tfrac{1}{2}\pi x_2 x_3\right) - \exp\left(-\left(\frac{x_1 + x_3}{x_2} - 2\right)^2\right). \qquad (7.7)$$

For a description of this function, see Powell (1964). Perhaps by good
luck, our procedure had no difficulty with this function: it found the
true minimum quickly and did not stop prematurely.


Box (Box (1966)):

$$f(x) = \sum_{i=1}^{10} \left[ (\exp(-i x_1/10) - \exp(-i x_2/10)) - x_3 (\exp(-i/10) - \exp(-i)) \right]^2. \qquad (7.8)$$

This function has minima of 0 at $(1, 10, 1)^T$, and also along the
line $\{(\lambda, \lambda, 0)^T\}$. (Our procedure found the first minimum.)
Kowalik and Osborne (1968) report that Powell's method took 205 function
evaluations to reduce f to 3.09 x 10-^ , so our method is about twice
as fast. Our method took 79 function evaluations to reduce f to
$2.29 \times 10^{-7}$, so it is faster, in this example, than any of the
methods compared by Box (1966), with the exception of Powell's method for
sums of squares (Powell (1965)). See the comment in Section 1 about special
methods for minimizing sums of squares.

Singular (Powell (1962)):

$$f(x) = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4. \qquad (7.9)$$

This function is difficult to minimize, and provides a severe test of
the stopping criterion, because the Hessian matrix at the minimum
(x = 0) is doubly singular. The function varies very slowly near 0
in the two-dimensional subspace $\{(10\lambda_1, -\lambda_1, \lambda_2, \lambda_2)^T\}$.
Table 7.4 and Diagram 7.3 suggest that the algorithm converges only linearly,
as does Powell's algorithm. It is interesting to note that the output
from our procedure would strongly suggest the singularity, if we did not
know it in advance: after 219 function evaluations, with
f(x) = 7.67 x 10 ^ , the computed eigenvalues were 101.0, 9.999,
0.003790, and 0.001014. (The exact eigenvalues at 0 are 101, 10,
0, and 0.) After 384 function evaluations, with f(x) reduced to
$1.02 \times 10^{-17}$, the two smallest eigenvalues were $1.56 \times 10^{-7}$
and $5.98 \times 10^{-8}$. Thus, our procedure should enable singularity of the
Hessian matrix to be detected, in the unlikely event that it occurred
in a practical problem. (For one example, see Freudenstein and Roth
(1963).)
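
The exact eigenvalues quoted above can be checked from the quadratic part of (7.9), which is $x^T A x$ with A block diagonal: a block [[1, 10], [10, 100]] for $(x_1, x_2)$ and a block [[5, -5], [-5, 5]] for $(x_3, x_4)$. The sketch below uses the closed form for a symmetric 2 by 2 matrix; the helper names and sample points are illustrative.

```python
import math


def singular(x):
    """Powell's singular function, equation (7.9)."""
    x1, x2, x3, x4 = x
    return ((x1 + 10.0 * x2) ** 2 + 5.0 * (x3 - x4) ** 2
            + (x2 - 2.0 * x3) ** 4 + 10.0 * (x1 - x4) ** 4)


def sym2x2_eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], largest first."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean + radius, mean - radius


block12 = sym2x2_eigenvalues(1.0, 10.0, 100.0)   # from (x1 + 10 x2)^2
block34 = sym2x2_eigenvalues(5.0, -5.0, 5.0)     # from 5 (x3 - x4)^2
exact = sorted(block12 + block34, reverse=True)

# f grows only quartically in the null subspace (10 l1, -l1, l2, l2)^T ...
in_null_space = singular([0.1, -0.01, 0.01, 0.01])
# ... but quadratically in a generic direction of comparable size.
generic = singular([0.1, 0.01, 0.0, 0.0])
```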

Wood (see Colville (1968)):

$$f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2 + 90(x_4 - x_3^2)^2 + (1 - x_3)^2 + 10.1[(x_2 - 1)^2 + (x_4 - 1)^2] + 19.8(x_2 - 1)(x_4 - 1). \qquad (7.10)$$

This function is rather like Rosenbrock's, but with four variables
instead of two. Procedures with an inadequate stopping criterion may
terminate prematurely on this function (see McCormick and Pearson (1969)),
but our procedure did find the minimum at $\mu = (1, 1, 1, 1)^T$.

Chebyquad (Fletcher (1965)):

f(x) is defined by the ALGOL procedure given by Fletcher (1965).
As the minimization problem is still valid, we have not corrected a
small error in this procedure. (The procedure does not compute exactly
what Fletcher intended.) In contrast to most of our other test functions,
which are designed to be difficult to minimize, this function is fairly
easy to minimize. For n = 1(1)7 and 9 the minimum is 0, for other
n it is nonzero. (For n = 8 it is approximately 0.00351687372568.)
The results given in Table 7.5, and illustrated in Diagrams 7.4 to 7.7,
show that our method is faster than those of Powell or of Davies, Swann
and Campey, but a little slower than Stewart's.


Watson  (see  Kowalik  and  Osborne  (1968)): 

$$f(x) = x_1^2 + (x_2 - x_1^2 - 1)^2 + \sum_{i=2}^{30} \left[ \sum_{j=2}^{n} (j-1)\, x_j \left(\frac{i-1}{29}\right)^{j-2} - \left( \sum_{j=1}^{n} x_j \left(\frac{i-1}{29}\right)^{j-1} \right)^{2} - 1 \right]^2 . \tag{7.11}$$

Here a polynomial

$$p(t) = x_1 + x_2 t + \cdots + x_n t^{n-1} \tag{7.12}$$

is fitted, by least squares, to approximate a solution of the
differential equation

$$dz/dt = 1 + z^2 , \tag{7.13}$$

with z(0) = 0, for $t \in [0,1]$. (The exact solution is z = tan(t).)
Because of a bad choice of basis functions $\{1, t, \ldots, t^{n-1}\}$, the
minimization problem is ill-conditioned, and rather difficult to solve.
For n = 6, the minimum is $f(\mu) \approx 2.28767005355 \times 10^{-3}$, at

$\mu \approx (-0.015725, 1.012435, -0.232992, 1.260430, -1.513729, 0.992996)^T$.

For n = 9, $f(\mu) \approx 1.399760138 \times 10^{-6}$, and $\mu \approx (-0.000015, 0.999790,
0.014764, 0.146342, 1.000821, -2.617731, 4.104403, -3.143612, 1.052627)^T$.
(We do not claim that all the figures given are significant.)

Kowalik and Osborne (1968) report that, after 700 function
evaluations, Powell's method had only reduced f to $2.434 \times 10^{-3}$
(for n = 6), so our method is at least twice as fast here. The
Watson problem for n = 9 is very ill-conditioned, and seems to be a
good test for a minimization procedure.
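
A minimal sketch of the Watson function follows. It assumes the usual least-squares form, in which the residual of $p'(t) - p(t)^2 - 1$ is sampled at $t = (i-1)/29$ for $i = 2, \ldots, 30$; the names `watson` and `MU6` are illustrative, and `MU6` is simply the six-figure minimizer quoted above for n = 6.

```python
def watson(x):
    """Watson's function for n = len(x) variables (assumed standard form)."""
    n = len(x)
    total = x[0] ** 2 + (x[1] - x[0] ** 2 - 1.0) ** 2
    for i in range(2, 31):
        t = (i - 1) / 29.0
        # p'(t) and p(t) for p(t) = x_1 + x_2 t + ... + x_n t^(n-1)
        dp = sum((j - 1) * x[j - 1] * t ** (j - 2) for j in range(2, n + 1))
        p = sum(x[j - 1] * t ** (j - 1) for j in range(1, n + 1))
        total += (dp - p * p - 1.0) ** 2
    return total


# Approximate minimizer for n = 6, copied from the text (six figures only).
MU6 = [-0.015725, 1.012435, -0.232992, 1.260430, -1.513729, 0.992996]

if __name__ == "__main__":
    print(watson([0.0] * 6))   # value at the origin
    print(watson(MU6))         # should be close to the quoted minimum
```

Because the problem is ill-conditioned, evaluating at the rounded minimizer changes f only slightly even though the components of x are known to few figures.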


Tridiag (see Gregory and Karney (1969), pp. 41 and 74):

$$f(x) = x^T A x - 2x_1 , \tag{7.14}$$

where

$$A = \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix} . \tag{7.15}$$

This function is useful for testing the quadratic convergence property.
The minimum $f(\mu) = -n$ occurs when $\mu$ is the first column of $A^{-1}$, i.e.,

$$\mu = (n, n-1, \ldots, 1)^T . \tag{7.16}$$

The results given in Table 7.1 show that, as expected, the minimum is
found in $n^2$ or fewer linear minimizations. The eigenvalues of A are
just

$$\lambda_j = 4\cos^2\left(\frac{j\pi}{2n+1}\right)$$

for $j = 1, \ldots, n$.

Hilbert 

$$f(x) = x^T A x , \tag{7.17}$$

where A is an n by n Hilbert matrix, i.e.,

$$a_{ij} = 1/(i+j-1) \tag{7.18}$$

for $1 \le i, j \le n$. f(x) can be computed directly without storing
the matrix A. Like (7.14), (7.17) is a positive definite quadratic
form, but the condition number increases rapidly with n. Because of
the effect of rounding errors, more than $n^2$ linear minimizations were

required  to  reduce  f  to  10 ,  except  for  n  =  2  .  The  procedure 

successfully found the minimum $\mu = 0$, to within the prescribed
tolerance, for $n \le 10$. For n = 12, some components of the computed
minimum were greater than 0.1, even though f was reduced to
$2.76 \times 10^{-19}$. This illustrates how ill-conditioned the problem is!
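
The remark that f(x) can be computed without storing A may be sketched as follows. The name `hilbert_form` is illustrative; exact rational arithmetic from the standard `fractions` module is used only to make the small examples exact.

```python
from fractions import Fraction


def hilbert_form(x):
    """x^T A x with a_ij = 1/(i+j-1), computed without storing A.
    With 0-based indices the denominator is i + j + 1."""
    n = len(x)
    return sum(x[i] * x[j] / (i + j + 1) for i in range(n) for j in range(n))


if __name__ == "__main__":
    one = Fraction(1)
    print(hilbert_form([one, one]))        # 1 + 1/2 + 1/2 + 1/3
    print(hilbert_form([one, -one]))       # 1 - 1/2 - 1/2 + 1/3
    print(hilbert_form([1.0, -1.0, 0.5]))  # floating-point input also works
```

The cost is $O(n^2)$ operations per evaluation and no matrix storage is needed.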


Some  more  detailed  results 

Tables 7.2 to 7.5 give more details of the progress of our procedure
(B) on the Rosenbrock, Helix, Singular, and Chebyquad functions. In
Diagrams 7.1 to 7.7, we plot

$$\Delta = \log_{10}(f(x) - f(\mu)) \tag{7.19}$$

against $n_f$, the number of function evaluations. Using the results
given by Fletcher (1965) and Stewart (1967), the corresponding graphs
for the methods of Davies, Swann and Campey (D), Powell (P), and
Stewart (S) are also given, for purposes of comparison.


Table 7.2: Rosenbrock

  n_f   n_l   f(x)          x_1          x_2
    1     0   2.42'1      -1.200000     1.000000
   11     4   4.14'0      -1.034611     1.071270
   21     8   3.42'0      -0.811598     0.621199
   31    12   2.59'0      -0.549031     0.258076
   45    17   1.67'0      -0.268211     0.046503
   58    22   1.07'0      -0.028125    -0.010783
   72    27   3.71'-1      0.482692     0.200894
   84    32   2.79'-3      0.947231     0.897130
   98    37   5.89'-4      0.996384     0.990382
  109    42   6.69'-9      0.999991     0.999974
  120    47   6.61'-18     1.000000     1.000000
  132    52   1.13'-23     1.000000     1.000000
  155    57   4.47'-24     1.000000     1.000000

Table 7.3: Helix

  n_f   n_l   f(x)          x_1          x_2           x_3
    1     0   2.50'3      -1.000000     0.000000      0.000000
   14     5   1.62'2       1.000000     2.000000      2.000000
   23     9   1.18'2       0.563832     1.952025      1.759493
   36    14   5.22'0       0.311857     1.000020      2.096124
   44    18   4.04'0       0.305534     0.967190      1.987145
   57    23   3.78'0       0.347506     0.907981      1.922708
   65    27   3.01'0       0.847973     0.734103      1.074593
   82    33   9.46'-1      0.816717     0.566910      0.969820
   91    37   3.66'-1      0.965734     0.342023      0.???844
  105    43   2.46'-1      1.004624     0.239418      0.364506
  113    47   2.84'-2      0.99384?     0.091699      0.153178
  126    53   6.35'-3      1.002319     0.045726      0.072132
  134    57   8.01'-4      1.002726     0.002303      0.002966
  147    63   8.66'-6      0.999996     0.001853      0.002942
  155    67   1.75'-11     1.000000     8.49'-9       2.47'-7
  169    73   1.12'-20     1.000000    -6.45'-11     -9.92'-11
  178    77   1.99'-24     1.000000    -1.69'-13     -2.47'-13
  200    83   1.94'-24     1.000000    -1.60'-13     -2.53'-13

Table 7.4: Singular*

  n_f   n_l   f(x)           n_f   n_l   f(x)
    1     0   2.15'2         234   106   9.76'-11
   19     6   1.18'1         244   111   2.03'-12
   31    11   7.96'0         254   116   4.11'-13
   42    16   7.75'0         269   123   2.61'-14
   58    22   2.94'0         279   128   6.43'-15
   68    27   9.86'-1        289   133   8.88'-16
   78    32   1.34'-1        308   140   7.35'-16
   94    38   6.92'-3        319   145   3.87'-16
  104    43   1.18'-3        330   150   9.92'-17
  114    48   5.25'-5        358   157   9.92'-17
  129    55   8.25'-6        373   162   1.65'-17
  139    60   2.13'-6        384   167   1.02'-17
  149    65   2.70'-7        404   174   9.95'-18
  164    72   7.91'-8        421   179   6.02'-23
  174    77   3.95'-8        436   184   5.89'-23
  184    82   3.90'-8        464   191   5.89'-23
  199    89   3.90'-8        4?6   196   5.89'-23
  209    94   3.89'-8
  219    99   7.67'-9

$x \approx (-9.73 \times 10^{-7},\ 9.73 \times 10^{-8},\ 5.31 \times 10^{-7},\ 5.31 \times 10^{-7})^T$, lying
approximately in the subspace $\{(10\lambda_1, -\lambda_1, \lambda_2, \lambda_2)\}$, as expected.

* See the comment under Table 7.1.

Table 7.5: Chebyquad

n = 6

  n_f   n_l   f(x)           n_f   n_l   f(x)
    1     0   4.64'-2        131    58   2.14'-5
    ?     8   2.35'-2        145    65   1.14'-5
   37    15   1.80'-2        159    72   2.71'-6
   51    22   1.21'-2        181    80   1.13'-7
   66    29   5.69'-3        195    87   6.59'-10
   81    36   2.07'-3        209    94   1.38'-10
  103    4?   9.89'-5        223   101   7.00'-13
    ?    51   3.47'-5        238   108   3.77'-15

$x \approx (0.066877, 0.288741, 0.366682, 0.633318, 0.711259, 0.933123)^T$

Table 7.5 continued

n = 8

  n_f   n_l   f(x)               n_f   n_l   f(x)
    1     0   0.0386176982859    208    92   0.0035269968747
   29    10   0.0171124413073    226   101   0.0035191392494
   47    19   0.0109131815974    244   110   0.0035180637576
    ?    28   0.0102860269896    262   119   0.0035176364629
   83    37   0.0093337335931    280   128   0.0035171964541
  102    46   0.0071908595069    308   138   0.0035168743745
  125    55   0.0049952481593    326   147   0.0035168737890
  144    64   0.0044432513463    345   156   0.0035168737290
  172    74   0.0037941612?5     364   165   0.0035168737288
  190    83   0.0035390722159

$x \approx (0.043153, 0.193091, 0.266329, 0.500000, 0.500000, 0.733671, 0.806910, 0.956847)^T$


[Diagrams 7.1 to 7.7: plots of $\Delta = \log_{10}(f(x) - f(\mu))$ against $n_f$, the number of function evaluations.]

Diagram 7.1: Rosenbrock
Diagram 7.2: Helix
Diagram 7.3: Singular (Powell's function of four variables)
Diagram 7.4: Chebyquad, n = 2
Diagram 7.7: Chebyquad, n = 8 (Results for Stewart's method not available.)

Key:  B: Our method,
      D: The method of Davies, Swann and Campey, as given by Fletcher (1965),
      P: Powell's (1964) method, as given by Fletcher (1965),
      S: Stewart's method, as given by Stewart (1967).


8.  Conclusion 


Powell (1964) observes that, with his determinantal criterion for
accepting new search directions (see Section 3), there is a tendency for
the new directions to be accepted less often as the number of variables
increases, and the quadratic convergence property of his basic procedure
is lost. Our aim was to avoid this difficulty, keep the quadratic
convergence property, and ensure that the search directions continue to
span the whole space, while using basically the same method as Powell
(and Smith (1962)) to generate conjugate directions.

The numerical results given in Section 7 suggest that our algorithm
is faster than Powell's, and comparable to Stewart's, if the criterion
is the number of function evaluations required to reduce f(x) to a
certain threshold. Also, our algorithm seems to be reliable even for
very ill-conditioned problems like Watson (n = 9) and Hilbert (n = 10),
while Stewart's method breaks down because of numerical difficulties on
some functions, e.g., the Rosenbrock and Singular functions (see
Stewart (1967)). However, we should not try to conclude too much from
the numerical results: see the warning in Section 7.

Theoretical  convergence  results 

Suppose that all arithmetic is exact (i.e., there are no rounding
errors), and consider our algorithm with the stopping criterion removed.
Since the algorithm keeps on performing linear searches along n
orthogonal directions, the same conditions that ensure convergence of
the method of coordinate search to a local minimum will ensure convergence
of our algorithm. In particular, the algorithm will converge to the
(unique) minimum for all functions f which are $C^2$, strictly convex,
and satisfy

$$\lim_{\lambda \to \infty} f(\lambda e) = +\infty \tag{8.1}$$

for all nonzero vectors e. Of course, this result is of little
practical interest, for in practice rounding errors may be very
important: see Section 5.

It is plausible that, if the Hessian matrix of f is strictly
positive definite at the minimum, then our algorithm will converge
superlinearly. McCormick (1969) shows that this is true for the reset
Davidon-Fletcher-Powell algorithm, provided a Lipschitz condition is
satisfied. Figures 7.1, 7.2, and 7.4 to 7.7 certainly suggest that
convergence is superlinear until rounding errors become important. We
do not have a proof of this conjecture though: perhaps additional
conditions on f, or a slight modification of the algorithm, are
necessary.
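
The method of coordinate search referred to above can be sketched as follows. This is not the principal axis procedure itself, only an illustration of repeated linear searches along n fixed orthogonal directions. The names `golden_section`, `coordinate_search` and `convex_example`, the bracketing radius, and the choice of golden-section search for the one-dimensional minimizations are all assumptions made for this sketch.

```python
import math


def golden_section(phi, a, b, tol=1e-10):
    """Minimize a unimodal function phi on [a, b] by golden-section search."""
    inv = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv * (b - a)
    d = a + inv * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv * (b - a)
            fc = phi(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def coordinate_search(f, x, sweeps=50, radius=10.0):
    """Cyclic linear searches along the coordinate axes (illustrative only)."""
    x = list(x)
    for _ in range(sweeps):
        for i in range(len(x)):
            def phi(t, i=i):
                y = list(x)
                y[i] = t
                return f(y)
            x[i] = golden_section(phi, x[i] - radius, x[i] + radius)
    return x


def convex_example(x):
    """A strictly convex quadratic with minimum at (0.25, -0.5)."""
    return (x[0] - 1.0) ** 2 + (x[0] - x[1]) ** 2 + 0.5 * (x[1] + 2.0) ** 2


if __name__ == "__main__":
    print(coordinate_search(convex_example, [3.0, 3.0]))
```

For this example the iterates converge only linearly; the conjecture above concerns the faster, superlinear rate obtained when the directions are replaced by approximately conjugate ones.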


9.  An ALGOL W procedure and test program

The procedure praxis, plus a driver program and test functions,
is given below. The language is ALGOL W (Wirth and Hoare (1966),
Bauer, Becker and Graham (1968)), but none of the special features
of ALGOL W have been used, so translation into another dialect of
ALGOL should be straightforward.

BEGIN COMMENT:
      TEST PROGRAM FOR PROCEDURE PRAXIS.
      **********************************
      ;

LONG REAL PROCEDURE PRAXIS (LONG REAL VALUE T, MACHEPS, H;
    INTEGER VALUE N, PRIN;
    LONG REAL ARRAY X(*); LONG REAL PROCEDURE F, RANDOM);
BEGIN COMMENT:
    THIS PROCEDURE MINIMIZES THE FUNCTION F(X, N) OF N
    VARIABLES X(1), ... X(N), USING THE PRINCIPAL AXIS METHOD.
    ON ENTRY X HOLDS A GUESS, ON RETURN IT HOLDS THE ESTIMATED
    POINT OF MINIMUM, WITH (HOPEFULLY) |ERROR| <
    SQRT(MACHEPS)*|X| + T, WHERE MACHEPS IS THE MACHINE
    PRECISION, THE SMALLEST NUMBER SUCH THAT 1 + MACHEPS > 1,
    T IS A TOLERANCE, AND |.| IS THE 2-NORM. H IS THE MAXIMUM
    STEP SIZE: SET TO ABOUT THE MAXIMUM EXPECTED DISTANCE FROM
    THE GUESS TO THE MINIMUM (IF H IS SET TOO SMALL OR TOO
    LARGE THEN THE INITIAL RATE OF CONVERGENCE WILL BE SLOW).
    THE USER SHOULD OBSERVE THE COMMENT ON HEURISTIC NUMBERS
    AFTER PROCEDURE QUAD.
    PRIN CONTROLS THE PRINTING OF INTERMEDIATE RESULTS.
    IF PRIN = 0, NO RESULTS ARE PRINTED.
    IF PRIN = 1, F IS PRINTED AFTER EVERY N+1 OR N+2 LINEAR
    MINIMIZATIONS, AND FINAL X IS PRINTED, BUT INTERMEDIATE
    X ONLY IF N <= 4.
    IF PRIN = 2, EIGENVALUES OF A AND SCALE FACTORS ARE ALSO
    PRINTED.
    IF PRIN = 3, F AND X ARE PRINTED AFTER EVERY FEW LINEAR
    MINIMIZATIONS.
    IF PRIN = 4, EIGENVECTORS ARE ALSO PRINTED.
    FMIN IS A GLOBAL VARIABLE: SEE PROCEDURE PRINT.
    RANDOM IS A PARAMETERLESS LONG REAL PROCEDURE WHICH RETURNS
    A RANDOM NUMBER UNIFORMLY DISTRIBUTED IN (0, 1). ANY
    INITIALIZATION MUST BE DONE BEFORE THE CALL TO PRAXIS.
    THE PROCEDURE IS MACHINE-INDEPENDENT, APART FROM THE OUTPUT
    STATEMENTS AND THE SPECIFICATION OF MACHEPS. WE ASSUME THAT
    MACHEPS**(-4) DOES NOT OVERFLOW (IF IT DOES THEN MACHEPS MUST
    BE INCREASED), AND THAT ON FLOATING-POINT UNDERFLOW THE
    RESULT IS SET TO ZERO;

PROCEDURE MINFIT (INTEGER VALUE N; LONG REAL VALUE EPS, TOL;
    LONG REAL ARRAY AB(*,*); LONG REAL ARRAY Q(*));
BEGIN COMMENT: AN IMPROVED VERSION OF MINFIT, SEE GOLUB &
               REINSCH (1969), RESTRICTED TO M = N, P = 0.
               THE SINGULAR VALUES OF THE ARRAY AB ARE
               RETURNED IN Q, AND AB IS OVERWRITTEN WITH
               THE ORTHOGONAL MATRIX V SUCH THAT
               U.DIAG(Q) = AB.V,
               WHERE U IS ANOTHER ORTHOGONAL MATRIX;
    INTEGER L, KT;
    LONG REAL C, F, G, H, S, X, Y, Z;
    LONG REAL ARRAY E(1::N);
    COMMENT: HOUSEHOLDER'S REDUCTION TO BIDIAGONAL FORM;
    G := X := 0;
    FOR I := 1 UNTIL N DO
    BEGIN
        E(I) := G; S := 0; L := I+1;
        FOR J := I UNTIL N DO S := S+AB(J,I)**2;
        IF S<TOL THEN G := 0 ELSE
        BEGIN
            F := AB(I,I); G := IF F<0 THEN LONGSQRT(S)
                               ELSE -LONGSQRT(S);
            H := F*G-S; AB(I,I) := F-G;
            FOR J := L UNTIL N DO
            BEGIN F := 0;
                FOR K := I UNTIL N DO F := F + AB(K,I)*AB(K,J);
                F := F/H;
                FOR K := I UNTIL N DO AB(K,J) := AB(K,J) + F*AB(K,I)
            END J
        END S;
        Q(I) := G; S := 0;
        IF I<=N THEN FOR J := L UNTIL N DO
            S := S + AB(I,J)**2;
        IF S<TOL THEN G := 0 ELSE
        BEGIN
            F := AB(I,I+1); G := IF F<0 THEN LONGSQRT(S)
                                 ELSE -LONGSQRT(S);
            H := F*G-S; AB(I,I+1) := F-G;
            FOR J := L UNTIL N DO E(J) := AB(I,J)/H;
            FOR J := L UNTIL N DO
            BEGIN S := 0;
                FOR K := L UNTIL N DO S := S + AB(J,K)*AB(I,K);
                FOR K := L UNTIL N DO AB(J,K) := AB(J,K) + S*E(K)
            END J
        END S;
        Y := ABS(Q(I)) + ABS(E(I)); IF Y > X THEN X := Y
    END I;
    COMMENT: ACCUMULATION OF RIGHT-HAND TRANSFORMATIONS;
    FOR I := N STEP -1 UNTIL 1 DO
    BEGIN
        IF G¬=0 THEN
        BEGIN
            H := AB(I,I+1)*G;
            FOR J := L UNTIL N DO AB(J,I) := AB(I,J)/H;
            FOR J := L UNTIL N DO
            BEGIN S := 0;
                FOR K := L UNTIL N DO S := S + AB(I,K)*AB(K,J);
                FOR K := L UNTIL N DO AB(K,J) := AB(K,J) + S*AB(K,I)
            END J
        END G;
        FOR J := L UNTIL N DO AB(I,J) := AB(J,I) := 0;
        AB(I,I) := 1; G := E(I); L := I
    END I;
    COMMENT: DIAGONALIZATION OF THE BIDIAGONAL FORM;
    EPS := EPS*X;
    FOR K := N STEP -1 UNTIL 1 DO
    BEGIN KT := 0;
TESTFSPLITTING:
        KT := KT + 1; IF KT > 30 THEN
        BEGIN E(K) := 0L;
            WRITE ("QR FAILED")
        END;
        FOR L2 := K STEP -1 UNTIL 1 DO
        BEGIN
            L := L2;
            IF ABS(E(L))<=EPS THEN GOTO TESTFCONVERGENCE;
            IF ABS(Q(L-1))<=EPS THEN GOTO CANCELLATION
        END L2;
        COMMENT: CANCELLATION OF E(L) IF L>1;
CANCELLATION:
        C := 0; S := 1;
        FOR I := L UNTIL K DO
        BEGIN
            F := S*E(I); E(I) := C*E(I);
            IF ABS(F)<=EPS THEN GOTO TESTFCONVERGENCE;
            G := Q(I); Q(I) := H := IF ABS(F) < ABS(G) THEN
                ABS(G)*LONGSQRT(1 + (F/G)**2) ELSE IF F ¬= 0 THEN
                ABS(F)*LONGSQRT(1 + (G/F)**2) ELSE 0;
            IF H = 0 THEN G := H := 1;
            COMMENT: THE ABOVE REPLACES Q(I):=H:=LONGSQRT(G*G+F*F)
                     WHICH MAY GIVE INCORRECT RESULTS IF THE
                     SQUARES UNDERFLOW OR IF F = G = 0;
            C := G/H; S := -F/H
        END I;
TESTFCONVERGENCE:
        Z := Q(K); IF L=K THEN GOTO CONVERGENCE;
        COMMENT: SHIFT FROM BOTTOM 2*2 MINOR;
        X := Q(L); Y := Q(K-1); G := E(K-1); H := E(K);
        F := ((Y-Z)*(Y+Z) + (G-H)*(G+H))/(2*H*Y);
        G := LONGSQRT(F*F+1);
        F := ((X-Z)*(X+Z)+H*(Y/(IF F<0 THEN F-G ELSE F+G)-H))/X;
        COMMENT: NEXT QR TRANSFORMATION;
        C := S := 1;
        FOR I := L+1 UNTIL K DO
        BEGIN
            G := E(I); Y := Q(I); H := S*G; G := G*C;
            E(I-1) := Z := IF ABS(F) < ABS(H) THEN
                ABS(H)*LONGSQRT(1 + (F/H)**2) ELSE IF F ¬= 0 THEN
                ABS(F)*LONGSQRT(1 + (H/F)**2) ELSE 0;
            IF Z = 0 THEN Z := F := 1;
            C := F/Z; S := H/Z;
            F := X*C + G*S; G := -X*S + G*C; H := Y*S;
            Y := Y*C;
            FOR J := 1 UNTIL N DO
            BEGIN
                X := AB(J,I-1); Z := AB(J,I);
                AB(J,I-1) := X*C + Z*S; AB(J,I) := -X*S + Z*C
            END J;
            Q(I-1) := Z := IF ABS(F) < ABS(H) THEN ABS(H)*
                LONGSQRT(1 + (F/H)**2) ELSE IF F ¬= 0 THEN
                ABS(F)*LONGSQRT(1 + (H/F)**2) ELSE 0;
            IF Z = 0 THEN Z := F := 1;
            C := F/Z; S := H/Z;
            F := C*G + S*Y; X := -S*G + C*Y
        END I;
        E(L) := 0; E(K) := F; Q(K) := X;
        GO TO TESTFSPLITTING;
CONVERGENCE:
        IF Z<0 THEN
        BEGIN COMMENT: Q(K) IS MADE NON-NEG;
            Q(K) := -Z;
            FOR J := 1 UNTIL N DO AB(J,K) := -AB(J,K)
        END Z
    END K
END MINFIT;

PROCEDURE SORT;
BEGIN COMMENT: SORTS THE ELEMENTS OF D AND CORRESPONDING
               COLUMNS OF V INTO DESCENDING ORDER;
    INTEGER K;
    LONG REAL S;
    FOR I := 1 UNTIL N - 1 DO
    BEGIN K := I; S := D(I); FOR J := I + 1 UNTIL N DO
        IF D(J) > S THEN
        BEGIN K := J; S := D(J) END;
        IF K > I THEN
        BEGIN D(K) := D(I); D(I) := S; FOR J := 1 UNTIL N DO
            BEGIN S := V(J,I); V(J,I) := V(J,K); V(J,K) := S
            END
        END
    END
END SORT;

PROCEDURE PRINT;
COMMENT: THE VARIABLE FMIN IS GLOBAL, AND ESTIMATES THE
         VALUE OF F AT THE MINIMUM: USED ONLY FOR
         PRINTING LOG(FX - FMIN);
IF PRIN > 0 THEN
BEGIN INTEGER SVINT; SVINT := INTFIELDSIZE;
    INTFIELDSIZE := 10;
    WRITE (NL, NF, FX);
    COMMENT: IF THE NEXT TWO LINES ARE OMITTED THEN FMIN IS
             NOT REQUIRED;
    IF FX <= FMIN THEN WRITEON (" UNDEFINED ") ELSE
        WRITEON (ROUNDTOREAL (LONGLOG (FX - FMIN)));
    COMMENT: "IOCONTROL(2)" MOVES TO THE NEXT LINE;
    IF N > 4 THEN IOCONTROL(2);
    IF (N <= 4) OR (PRIN > 2) THEN
        FOR I := 1 UNTIL N DO WRITEON(ROUNDTOREAL(X(I)));
    IOCONTROL(2); INTFIELDSIZE := SVINT
END PRINT;

PROCEDURE MATPRINT (STRING(80) VALUE S; LONG REAL ARRAY
    V(*,*); INTEGER VALUE M, N);
BEGIN COMMENT: PRINTS M X N MATRIX V COLUMN BY COLUMN;
    WRITE (S);
    FOR K := 1 UNTIL (N + 7) DIV 8 DO
    BEGIN FOR I := 1 UNTIL M DO
        BEGIN IOCONTROL(2);
            FOR J := 8*K - 7 UNTIL (IF N < (8*K) THEN N ELSE 8*K)
            DO WRITEON (ROUNDTOREAL (V (I,J)))
        END;
        WRITE (" "); IOCONTROL(2)
    END
END MATPRINT;

PROCEDURE VECPRINT (STRING(32) VALUE S; LONG REAL ARRAY V(*);
    INTEGER VALUE N);
BEGIN COMMENT: PRINTS THE HEADING S AND N-VECTOR V;
    WRITE(S);
    FOR I := 1 UNTIL N DO WRITEON(ROUNDTOREAL(V(I)))
END VECPRINT;

PROCEDURE MIN (INTEGER VALUE J, NITS; LONG REAL VALUE
    RESULT D2, X1; LONG REAL VALUE F1; BOOLEAN VALUE FK);
BEGIN COMMENT:
    MINIMIZES F FROM X IN THE DIRECTION V(*,J)
    UNLESS J<1, WHEN A QUADRATIC SEARCH IS DONE
    IN THE PLANE DEFINED BY Q0, Q1 AND X.
    D2 AN APPROXIMATION TO HALF F" (OR ZERO),
    X1 AN ESTIMATE OF DISTANCE TO MINIMUM,
    RETURNED AS THE DISTANCE FOUND.
    IF FK = TRUE THEN F1 IS FLIN(X1), OTHERWISE
    X1 AND F1 ARE IGNORED ON ENTRY UNLESS FINAL
    FX > F1. NITS CONTROLS THE NUMBER OF TIMES
    AN ATTEMPT IS MADE TO HALVE THE INTERVAL.
    SIDE EFFECTS: USES AND ALTERS X, FX, NF, NL.
    IF J < 1 USES VARIABLES Q... .
    USES H, N, T, M2, M4, LDT, DMIN, MACHEPS;

    LONG REAL PROCEDURE FLIN (LONG REAL VALUE L);
    COMMENT: THE FUNCTION OF ONE VARIABLE L WHICH IS
             MINIMIZED BY PROCEDURE MIN;
    BEGIN LONG REAL ARRAY T(1::N);
        IF J > 0 THEN
        BEGIN COMMENT: LINEAR SEARCH;
            FOR I := 1 UNTIL N DO T(I) := X(I) + L*V(I,J)
        END
        ELSE
        BEGIN COMMENT: SEARCH ALONG A PARABOLIC SPACE-CURVE;
            QA := L*(L - QD1)/(QD0*(QD0 + QD1));
            QB := (L + QD0)*(QD1 - L)/(QD0*QD1);
            QC := L*(L + QD0)/(QD1*(QD0 + QD1));
            FOR I := 1 UNTIL N DO T(I) := QA*Q0(I)+QB*X(I)+QC*Q1(I)
        END;
        COMMENT: INCREMENT FUNCTION EVALUATION COUNTER;
        NF := NF + 1;
        F(T, N)
    END FLIN;

    INTEGER K; BOOLEAN DZ;
    LONG REAL X2, XM, F0, F2, FM, D1, T2, S, SF1, SX1;
    SF1 := F1; SX1 := X1;
    K := 0; XM := 0; F0 := FM := FX; DZ := (D2 < MACHEPS);
    COMMENT: FIND STEP SIZE;
    S := 0; FOR I := 1 UNTIL N DO S := S + X(I)**2;
    S := LONGSQRT(S);
    T2 := M4*LONGSQRT(ABS(FX)/(IF DZ THEN DMIN ELSE D2)
          + S*LDT) + M2*LDT;
    S := M4*S + T;
    IF DZ AND (T2 > S) THEN T2 := S;
    IF T2 < SMALL THEN T2 := SMALL;
    IF T2 > (0.01*H) THEN T2 := 0.01*H;
    IF FK AND (F1 <= FM) THEN BEGIN XM := X1; FM := F1 END;
    IF ¬FK OR (ABS(X1) < T2) THEN
    BEGIN X1 := IF X1 >= 0L THEN T2 ELSE -T2;
        F1 := FLIN(X1)
    END;
    IF F1 <= FM THEN BEGIN XM := X1; FM := F1 END;
L0: IF DZ THEN
    BEGIN COMMENT: EVALUATE FLIN AT ANOTHER POINT AND
                   ESTIMATE THE SECOND DERIVATIVE;
        X2 := IF F0 < F1 THEN -X1 ELSE 2*X1; F2 := FLIN(X2);
        IF F2 <= FM THEN BEGIN XM := X2; FM := F2 END;
        D2 := (X2*(F1 - F0) - X1*(F2 - F0))/(X1*X2*(X1 - X2))
    END;
    COMMENT: ESTIMATE FIRST DERIVATIVE AT 0;
    D1 := (F1 - F0)/X1 - X1*D2; DZ := TRUE;
    COMMENT: PREDICT MINIMUM;
    X2 := IF D2 <= SMALL THEN (IF D1 < 0 THEN H ELSE -H) ELSE
          -0.5L*D1/D2;
    IF ABS(X2) > H THEN X2 := IF X2 > 0 THEN H ELSE -H;
    COMMENT: EVALUATE F AT THE PREDICTED MINIMUM;
L1: F2 := FLIN(X2);
    IF (K < NITS) AND (F2 > F0) THEN
    BEGIN COMMENT: NO SUCCESS SO TRY AGAIN; K := K + 1;
        IF (F0 < F1) AND ((X1*X2) > 0) THEN GO TO L0;
        X2 := 0.5L*X2; GO TO L1
    END;
    COMMENT: INCREMENT ONE-DIMENSIONAL SEARCH COUNTER;
    NL := NL + 1;
    IF F2 > FM THEN X2 := XM ELSE FM := F2;
    COMMENT: GET NEW ESTIMATE OF SECOND DERIVATIVE;
    D2 := IF ABS(X2*(X2 - X1)) > SMALL THEN
              (X2*(F1 - F0) - X1*(FM - F0))/(X1*X2*(X1 - X2))
          ELSE IF K > 0 THEN 0 ELSE D2;
    IF D2 <= SMALL THEN D2 := SMALL;
    X1 := X2; FX := FM;
    IF SF1 < FX THEN BEGIN FX := SF1; X1 := SX1 END;
    COMMENT: UPDATE X FOR LINEAR SEARCH BUT NOT FOR
             PARABOLIC SEARCH;
    IF J > 0 THEN FOR I := 1 UNTIL N DO X(I) := X(I) + X1*V(I,J)
END MIN;

PROCEDURE QUAD;
BEGIN COMMENT: LOOKS FOR THE MINIMUM ALONG A CURVE
               DEFINED BY Q0, Q1 AND X;
    LONG REAL L, S;
    S := FX; FX := QF1; QF1 := S; QD1 := 0;
    FOR I := 1 UNTIL N DO
    BEGIN S := X(I); X(I) := L := Q1(I); Q1(I) := S;
        QD1 := QD1 + (S - L)**2
    END;
    L := QD1 := LONGSQRT(QD1); S := 0;
    IF (QD0 > 0) AND (QD1 > 0) AND (NL >= (3*N*N)) THEN
    BEGIN MIN (0, 2, S, L, QF1, TRUE);
        QA := L*(L - QD1)/(QD0*(QD0 + QD1));
        QB := (L + QD0)*(QD1 - L)/(QD0*QD1);
        QC := L*(L + QD0)/(QD1*(QD0 + QD1))
    END
    ELSE BEGIN FX := QF1; QA := QB := 0; QC := 1 END;
    QD0 := QD1; FOR I := 1 UNTIL N DO
    BEGIN S := Q0(I); Q0(I) := X(I);
        X(I) := QA*S + QB*X(I) + QC*Q1(I)
    END
END QUAD;

BOOLEAN ILLC;
INTEGER NL, NF, KL, KT, KTM;
LONG REAL S, SL, DN, DMIN, FX, F1, LDS, LDT, SF, DF,
    QF1, QD0, QD1, QA, QB, QC,
    M2, M4, SMALL, VSMALL, LARGE, VLARGE, SCBD, LDFAC, T2;
LONG REAL ARRAY D, Y, Z, Q0, Q1 (1::N);
LONG REAL ARRAY V (1::N, 1::N);

COMMENT: INITIALIZATION;
COMMENT: MACHINE DEPENDENT NUMBERS;
SMALL := MACHEPS**2; VSMALL := SMALL**2;
LARGE := 1L/SMALL; VLARGE := 1L/VSMALL;
M2 := LONGSQRT(MACHEPS); M4 := LONGSQRT(M2);
COMMENT: HEURISTIC NUMBERS
         *****************
    IF AXES MAY BE BADLY SCALED (WHICH IS TO BE AVOIDED IF
    POSSIBLE) THEN SET SCBD := 10, OTHERWISE 1.
    IF THE PROBLEM IS KNOWN TO BE ILLCONDITIONED SET
    ILLC := TRUE, OTHERWISE FALSE.
    KTM+1 IS THE NUMBER OF ITERATIONS WITHOUT IMPROVEMENT BEFORE
    THE ALGORITHM TERMINATES (SEE SECTION 6). KTM = 4 IS VERY
    CAUTIOUS: USUALLY KTM = 1 IS SATISFACTORY;
SCBD := 1; ILLC := FALSE; KTM := 1;
LDFAC := IF ILLC THEN 0.1 ELSE 0.01;
KT := NL := 0; NF := 1; QF1 := FX := F(X,N);
T := T2 := SMALL + ABS(T); DMIN := SMALL;
IF H < (100*T) THEN H := 100*T; LDT := H;
FOR I := 1 UNTIL N DO FOR J := 1 UNTIL N DO
    V(I,J) := IF I = J THEN 1L ELSE 0L;
D(1) := QD0 := 0; FOR I := 1 UNTIL N DO Q1(I) := X(I);
PRINT;

COMMENT: MAIN LOOP;
L0: SF := D(1); D(1) := S := 0;
COMMENT: MINIMIZE ALONG FIRST DIRECTION;
MIN (1, 2, D(1), S, FX, FALSE);
IF S <= 0 THEN FOR I := 1 UNTIL N DO V(I,1) := -V(I,1);
IF (SF <= (0.9*D(1))) OR ((0.9*SF) >= D(1)) THEN
    FOR I := 2 UNTIL N DO D(I) := 0;
FOR K := 2 UNTIL N DO
BEGIN FOR I := 1 UNTIL N DO Y(I) := X(I); SF := FX;
    ILLC := ILLC OR (KT > 0);
L1: KL := K; DF := 0; IF ILLC THEN
    BEGIN COMMENT: RANDOM STEP TO GET OFF RESOLUTION VALLEY;
        FOR I := 1 UNTIL N DO
        BEGIN S := Z(I) := (0.1*LDT + T2*10**KT)*(RANDOM-0.5L);
            COMMENT: PRAXIS ASSUMES THAT RANDOM RETURNS A RANDOM
                     NUMBER UNIFORMLY DISTRIBUTED IN (0, 1) AND
                     THAT ANY INITIALIZATION OF THE RANDOM NUMBER
                     GENERATOR HAS ALREADY BEEN DONE;
            FOR J := 1 UNTIL N DO X(J) := X(J) + S*V(J,I)
        END;
        FX := F(X, N); NF := NF + 1
    END;
    FOR K2 := K UNTIL N DO
    BEGIN SL := FX; S := 0;
        COMMENT: MINIMIZE ALONG "NON-CONJUGATE" DIRECTIONS;
        MIN (K2, 2, D(K2), S, FX, FALSE);
        S := IF ILLC THEN D(K2)*(S + Z(K2))**2 ELSE SL - FX;
        IF DF < S THEN
        BEGIN DF := S; KL := K2
        END
    END;
    IF ¬ILLC AND (DF < ABS(100*MACHEPS*FX)) THEN
    BEGIN COMMENT: NO SUCCESS ILLC = FALSE SO TRY ONCE
                   WITH ILLC = TRUE;
        ILLC := TRUE; GO TO L1
    END;
    IF (K = 2) AND (PRIN > 1) THEN VECPRINT ("NEW D", D, N);
    FOR K2 := 1 UNTIL K - 1 DO
    BEGIN COMMENT: MINIMIZE ALONG "CONJUGATE" DIRECTIONS;
        S := 0; MIN (K2, 2, D(K2), S, FX, FALSE)
    END;
    F1 := FX; FX := SF; LDS := 0;
    FOR I := 1 UNTIL N DO
    BEGIN SL := X(I); X(I) := Y(I); SL := Y(I) := SL - Y(I);
        LDS := LDS + SL*SL
    END;
    LDS := LONGSQRT(LDS); IF LDS > SMALL THEN
    BEGIN COMMENT: THROW AWAY DIRECTION KL AND MINIMIZE
                   ALONG THE NEW "CONJUGATE" DIRECTION;
        FOR I := KL - 1 STEP -1 UNTIL K DO
        BEGIN FOR J := 1 UNTIL N DO V(J,I + 1) := V(J,I);
            D(I + 1) := D(I)
        END;
        D(K) := 0; FOR I := 1 UNTIL N DO V(I,K) := Y(I)/LDS;
        MIN (K, 4, D(K), LDS, F1, TRUE);
        IF LDS <= 0 THEN
        BEGIN LDS := -LDS;
            FOR I := 1 UNTIL N DO V(I,K) := -V(I,K)
        END
    END;
    LDT := LDFAC*LDT; IF LDT < LDS THEN LDT := LDS;
    PRINT;
    T2 := 0; FOR I := 1 UNTIL N DO T2 := T2 + X(I)**2;
    T2 := M2*LONGSQRT(T2) + T;
    COMMENT: SEE IF STEP LENGTH EXCEEDS HALF THE TOLERANCE;
    KT := IF LDT > (0.5*T2) THEN 0 ELSE KT + 1;
    IF KT > KTM THEN GO TO L2
END;


d 


COMMENT:  TRY  QUADRATIC  EXTRAPOLATION  IN  CASE  WE  ARE  STUCK 
IN  A  CURVED  VALLEY; 

QUAD; 

DN  0;  FOR  I  :=  1  UNTIL  N  DO 
BEGIN  D( I )  :«  1/ LONGSQRT(D( I ) ) ; 

IF  DN  <  DC  I )  THEN  DN  :«  DC  I > 

END; 

IF  PR  IN  >  3  THEN  MATPRINT  ("NEW  Dl RECTI ONS",  V,  N,  N); 

FOR  J  1  UNTIL  N  DO 
BEGIN  S  D(J)/DN; 

FOR  I  :«  1  UNTIL  N  DO  V(I,J)  :*  S*V(I,J) 

END; 

IF SCBD > 1 THEN
BEGIN COMMENT: SCALE AXES TO TRY TO REDUCE CONDITION
        NUMBER;
    S := VLARGE; FOR I := 1 UNTIL N DO
    BEGIN SL := 0; FOR J := 1 UNTIL N DO SL := SL + V(I,J)**2;
        Z(I) := LONGSQRT(SL);
        IF Z(I) < M4 THEN Z(I) := M4; IF S > Z(I) THEN S := Z(I)
    END;
    FOR I := 1 UNTIL N DO
    BEGIN SL := S/Z(I); Z(I) := 1/SL; IF Z(I) > SCBD THEN
        BEGIN SL := 1/SCBD; Z(I) := SCBD
        END;
        FOR J := 1 UNTIL N DO V(I,J) := SL*V(I,J)
    END
END;

COMMENT: TRANSPOSE V FOR MINFIT;
FOR I := 2 UNTIL N DO FOR J := 1 UNTIL I - 1 DO
BEGIN S := V(I,J); V(I,J) := V(J,I); V(J,I) := S END;
COMMENT: FIND THE SINGULAR VALUE DECOMPOSITION OF V. THIS
        GIVES THE EIGENVALUES AND PRINCIPAL AXES OF THE
        APPROXIMATING QUADRATIC FORM WITHOUT SQUARING THE
        CONDITION NUMBER;
MINFIT (N, MACHEPS, VSMALL, V, D);

IF SCBD > 1 THEN
BEGIN COMMENT: UNSCALING; FOR I := 1 UNTIL N DO
    BEGIN S := Z(I);
        FOR J := 1 UNTIL N DO V(I,J) := S*V(I,J)
    END;
    FOR I := 1 UNTIL N DO
    BEGIN S := 0; FOR J := 1 UNTIL N DO S := S + V(J,I)**2;
        S := LONGSQRT(S); D(I) := S*D(I); S := 1/S;
        FOR J := 1 UNTIL N DO V(J,I) := S*V(J,I)
    END
END;
FOR I := 1 UNTIL N DO
BEGIN D(I) := IF (DN*D(I)) > LARGE THEN VSMALL ELSE
        IF (DN*D(I)) < SMALL THEN VLARGE ELSE (DN*D(I))**(-2)
END;

COMMENT: SORT NEW EIGENVALUES AND EIGENVECTORS;
SORT;
DMIN := D(N); IF DMIN < SMALL THEN DMIN := SMALL;
ILLC := (M2*D(1)) > DMIN;
IF (PRIN > 1) AND (SCBD > 1) THEN
    VECPRINT ("SCALE FACTORS", Z, N);
IF PRIN > 1 THEN VECPRINT ("EIGENVALUES OF A", D, N);
IF PRIN > 3 THEN MATPRINT ("EIGENVECTORS OF A", V, N, N);
COMMENT: GO BACK TO MAIN LOOP;
GO TO L0;

L2: IF PRIN > 0 THEN VECPRINT ("X IS", X, N);
FX
END PRAXIS;

COMMENT: RANDOM NUMBER GENERATOR

    PROCEDURE RANDOM RETURNS A LONG REAL RANDOM NUMBER UNIFORMLY
    DISTRIBUTED IN (0,1) (INCLUDING 0 BUT NOT 1).

    RANINIT(R) WITH R ANY INTEGER MUST BE CALLED FOR
    INITIALIZATION BEFORE THE FIRST CALL TO RANDOM, AND THE
    DECLARATIONS OF RAN1, RAN2 AND RAN3 MUST BE GLOBAL.

    THE ALGORITHM RETURNS X(N)/2**56, WHERE

        X(N) = X(N-1) + X(N-127) (MOD 2**56).

    SINCE 1 + X + X**127 IS PRIMITIVE (MOD 2), THE PERIOD IS AT
    LEAST 2**127 - 1 > 10**38. SEE KNUTH (1969), PP. 26, 34, 464.

    X(N) IS STORED IN A LONG REAL WORD AS
    RAN3 = X(N)/2**56 - 1/2, AND ALL FLOATING POINT ARITHMETIC
    IS EXACT;

LONG REAL RAN1; INTEGER RAN2; LONG REAL ARRAY RAN3 (0::126);

PROCEDURE RANINIT (INTEGER VALUE R);
BEGIN R := ABS(R) REM 8190 + 1;
    RAN2 := 127; WHILE RAN2 > 0 DO
    BEGIN RAN2 := RAN2 - 1; RAN1 := -2L**55;
        FOR I := 1 UNTIL 7 DO
        BEGIN R := (1756*R) REM 8191;
            RAN1 := (RAN1 + (R DIV 32))*(1/256);
        END;
        RAN3 (RAN2) := RAN1
    END
END RANINIT;

LONG REAL PROCEDURE RANDOM;
BEGIN RAN2 := IF RAN2 = 0 THEN 126 ELSE RAN2 - 1;
    RAN1 := RAN1 + RAN3 (RAN2);
    RAN3 (RAN2) := RAN1 := IF RAN1 < 0L THEN RAN1 + 0.5L
                           ELSE RAN1 - 0.5L;
    RAN1 + 0.5L
END RANDOM;
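
The recurrence described in the comment above can be sketched in Python with exact integer arithmetic. This is an illustrative translation, not part of the ALGOL W program: the class name `AdditiveGenerator` and its method names are hypothetical, the table holds the integers X(N) directly rather than the offset long reals X(N)/2**56 - 1/2, and `random` drops the three low bits so that the result fits a 53-bit double (the original relies on a 56-bit long real mantissa).

```python
class AdditiveGenerator:
    """Illustrative integer version of x(n) = x(n-1) + x(n-127) (mod 2**56)."""

    LAG = 127
    MODULUS = 2 ** 56

    def __init__(self, seed: int) -> None:
        # Same seeding arithmetic as RANINIT: a small multiplicative
        # generator modulo the prime 8191 supplies 7 bytes per table word.
        r = abs(seed) % 8190 + 1
        self.table = [0] * self.LAG
        for slot in range(self.LAG - 1, -1, -1):
            word = 0
            for k in range(7):
                r = (1756 * r) % 8191
                word += (r // 32) << (8 * k)  # r // 32 lies in 0..255
            self.table[slot] = word
        self.index = 0
        self.last = self.table[0]  # RANINIT leaves RAN1 equal to RAN3(0)

    def next_int(self) -> int:
        # Step backwards through the circular table; the slot about to be
        # overwritten holds the value produced 127 calls earlier.
        self.index = self.LAG - 1 if self.index == 0 else self.index - 1
        self.last = (self.last + self.table[self.index]) % self.MODULUS
        self.table[self.index] = self.last
        return self.last

    def random(self) -> float:
        # Drop the 3 low bits so the quotient is exact in a 53-bit double
        # and stays strictly below 1 (an adaptation, not in the original).
        return (self.next_int() >> 3) / 2 ** 53
```

Working with integers sidesteps the sign test and the 0.5L offsets in RANDOM, which exist only to keep every floating point sum exactly representable on the original machine.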
COMMENT:
        TEST FUNCTIONS
        **************;

LONG REAL PROCEDURE ROS (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE ROSENBROCK (1960);
    100L*((X(2) - X(1)**2)**2) + (1L - X(1))**2;

LONG REAL PROCEDURE SING (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE POWELL (1962);
    (X(1) + 10L*X(2))**2 + 5L*(X(3) - X(4))**2 + (X(2) - 2L*X(3))**4
    + 10L*(X(1) - X(4))**4;

LONG REAL PROCEDURE HELIX (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE FLETCHER & POWELL (1963);
BEGIN LONG REAL R, T;
    R := LONGSQRT (X(1)**2 + X(2)**2);
    T := IF X(1) = 0 THEN 0.25L ELSE LONGARCTAN (X(2)/X(1))/(2L*
            3.14159265358979L);
    IF X(1) < 0 THEN T := T + 0.5L;
    100L*((X(3) - 10L*T)**2 + (R - 1L)**2) + X(3)**2
END HELIX;

LONG REAL PROCEDURE CUBE (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE LEON (1966);
    100L*(X(2) - X(1)**3)**2 + (1L - X(1))**2;


LONG REAL PROCEDURE BEALE (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE BEALE (1958);
    (1.5L - X(1)*(1L - X(2)))**2 +
    (2.25L - X(1)*(1L - X(2)**2))**2 +
    (2.625L - X(1)*(1L - X(2)**3))**2;


LONG REAL PROCEDURE WATSON (LONG REAL ARRAY X(*);
        INTEGER VALUE N);
COMMENT: SEE KOWALIK & OSBORNE (1968);
BEGIN LONG REAL S, T, U, Y;
    S := X(1)**2 + (X(2) - X(1)**2 - 1L)**2;
    FOR I := 2 UNTIL 30 DO
    BEGIN Y := (I - 1)/29; T := X(N);
        FOR J := N - 1 STEP -1 UNTIL 1 DO T := X(J) + Y*T;
        U := (N - 1)*X(N);
        FOR J := N - 1 STEP -1 UNTIL 2 DO U := (J - 1)*X(J) + Y*U;
        S := S + (U - T*T - 1L)**2
    END;
    S
END WATSON;


LONG REAL PROCEDURE CHEBYQUAD (LONG REAL ARRAY X(*);
        INTEGER VALUE N);
COMMENT: SEE FLETCHER (1965);
BEGIN
    LONG REAL F, DELTA, TPLUS;
    BOOLEAN EVEN;
    LONG REAL ARRAY Y, TI, TMINUS (1::N);
    DELTA := 0L;
    FOR J := 1 UNTIL N DO
    BEGIN Y(J) := 2L*X(J) - 1L;
        DELTA := DELTA + Y(J);
        TI(J) := Y(J); TMINUS(J) := 1L
    END;
    F := DELTA**2; EVEN := FALSE;
    FOR I := 2 UNTIL N DO
    BEGIN EVEN := ¬EVEN; DELTA := 0L;
        FOR J := 1 UNTIL N DO
        BEGIN TPLUS := 2L*Y(J)*TI(J) - TMINUS(J);
            DELTA := DELTA + TPLUS;
            TMINUS(J) := TI(J);
            TI(J) := TPLUS
        END;
        DELTA := DELTA/N - (IF EVEN THEN 1/(1 - I*I) ELSE 0);
        F := F + DELTA**2
    END;
    F
END CHEBYQUAD;

LONG REAL PROCEDURE POWELL (LONG REAL ARRAY X(*);
        INTEGER VALUE N);
COMMENT: SEE POWELL (1964);
    3L - 1L/(1L + (X(1) - X(2))**2) -
    LONGSIN(0.5L*3.14159265358979L*X(2)*X(3)) - (IF X(2) = 0 THEN
    0L ELSE LONGEXP(-((X(1) + X(3))/X(2) - 2L)**2));

LONG REAL PROCEDURE WOOD (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE MCCORMICK & PEARSON (1969) OR COLVILLE (1968);
    100L*(X(2) - X(1)**2)**2 + (1L - X(1))**2 + 90L*(X(4) -
    X(3)**2)**2 + (1L - X(3))**2 + 10.1L*((X(2) - 1L)**2 + (X(4)
    - 1L)**2) + 19.8L*(X(2) - 1L)*(X(4) - 1L);

LONG REAL PROCEDURE HILBERT (LONG REAL ARRAY X(*);
        INTEGER VALUE N);
COMMENT: COMPUTES XT.A.X, WHERE A IS THE N BY N HILBERT
        MATRIX, SEE GREGORY & KARNEY (1969), PP. 33, 66;
BEGIN LONG REAL S, T;
    S := 0L; FOR I := 1 UNTIL N DO
    BEGIN T := 0L; FOR J := 1 UNTIL N DO
            T := T + X(J)/(I + J - 1);
        S := S + T*X(I)
    END;
    S
END HILBERT;
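
As a cross-check on the quadratic form computed by HILBERT, the same double loop can be sketched in Python. This is an illustrative translation, not part of the ALGOL W program: the function name `hilbert` is hypothetical, indices are 0-based (so the Hilbert entry 1/(I + J - 1) becomes 1/(i + j + 1)), and `Fraction` is used only so that the worked value is exact.

```python
from fractions import Fraction


def hilbert(x):
    """Return x^T A x, where A[i][j] = 1/(i + j + 1) with 0-based indices."""
    n = len(x)
    s = 0
    for i in range(n):
        t = 0
        for j in range(n):
            t += x[j] / (i + j + 1)  # row i of A times x
        s += t * x[i]
    return s


# For n = 2 and x = (1, 1): 1 + 1/2 + 1/2 + 1/3 = 7/3.
value = hilbert([Fraction(1), Fraction(1)])
```

Since A is positive definite, the form is zero only at the origin, which is why the test program below sets FMIN to 0 for this function.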
LONG REAL PROCEDURE TRIDIAG (LONG REAL ARRAY X(*);
        INTEGER VALUE N);
COMMENT: COMPUTES XT.A.X - 2E1T.X, WHERE N > 1,

            (  1  -1   0   0  ...   0)
            ( -1   2  -1   0  ...   0)
            (  0  -1   2  -1  ...   0)
        A = ( ...                    )
            (  0  ...     -1   2  -1)
            (  0  ...      0  -1   2)

        AND E1T = (1, 0, ..., 0).
        SEE GREGORY & KARNEY (1969), PP. 41, 74;

BEGIN LONG REAL S;
    S := X(1)*(X(1) - X(2));
    FOR I := 2 UNTIL N - 1 DO
        S := S + X(I)*((X(I) - X(I - 1)) + (X(I) - X(I + 1)));
    S + X(N)*(2*X(N) - X(N - 1)) - 2*X(1)
END TRIDIAG;
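
The function computed by TRIDIAG can be sketched in Python to check its minimum. This is an illustrative translation, not part of the ALGOL W program: the function name `tridiag` is hypothetical and indices are 0-based. Setting the gradient 2(A.X - E1) to zero gives A.X = E1, which is satisfied by X(I) = N + 1 - I, and the value there is -X(1) = -N.

```python
def tridiag(x):
    """Return x^T A x - 2*x[0] for the tridiagonal A in the comment (n > 1)."""
    n = len(x)
    s = x[0] * (x[0] - x[1])  # first row of A is (1, -1, 0, ..., 0)
    for i in range(1, n - 1):
        # interior rows are (..., -1, 2, -1, ...)
        s += x[i] * ((x[i] - x[i - 1]) + (x[i] - x[i + 1]))
    # last row is (0, ..., 0, -1, 2), then subtract the linear term
    return s + x[n - 1] * (2 * x[n - 1] - x[n - 2]) - 2 * x[0]


def minimizer(n):
    """Solution of A x = e1: x = (n, n - 1, ..., 1)."""
    return [n - i for i in range(n)]
```

For example, `tridiag(minimizer(4))` evaluates to -4, and moving any component away from the minimizer increases the value because A is positive definite.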

LONG REAL PROCEDURE BOX (LONG REAL ARRAY X(*); INTEGER VALUE N);
COMMENT: SEE BOX (1966) OR BROWN & DENNIS (1970);
BEGIN LONG REAL P, S;
    S := 0; FOR I := 1 UNTIL 10 DO
    BEGIN P := -I/10;
        S := S + ((LONGEXP(P*X(1)) - (IF (P*X(2)) < (-40) THEN 0
                ELSE LONGEXP(P*X(2)))) -
                X(3)*(LONGEXP(P) - LONGEXP(10*P)))**2
    END;
    S
END BOX;

COMMENT:  GENERAL  TESTING  PROCEDURE 

*************************; 

PROCEDURE TEST (STRING (80) VALUE S; LONG REAL VALUE H;
        LONG REAL PROCEDURE F; INTEGER VALUE N);
BEGIN LONG REAL FMIN; INTEGER TIM;
    WRITE (" "); WRITE (" "); WRITE (S);
    WRITE ("N =", N, " H =", ROUNDTOREAL(H)); WRITE (" ");
    COMMENT: INITIALIZE RANDOM NUMBER GENERATOR; RANINIT(1);
    COMMENT: TIME(2) RETURNS CLOCK TIME IN UNITS OF 26 MICROSEC;
    TIM := TIME(2);
    FMIN := PRAXIS (1'-5, 16**(-13), H, N, 1, X, F, RANDOM);
    WRITE ("TIME (MILLISEC)", ROUND ((TIME(2) - TIM)/38.4));
    WRITE (" ")
END TEST;

COMMENT: TESTING PROGRAM;

LONG REAL FMIN, LAM;
COMMENT: INCREASE DIMENSIONS FOR N > 20;
LONG REAL ARRAY X(1::20);
COMMENT: INTFIELDSIZE CONTROLS THE OUTPUT FORMAT OF INTEGERS;
INTFIELDSIZE := 7;

X(1) := -1.2L; X(2) := 1L; FMIN := 0;
TEST ("ROSENBROCK'S FUNCTION WITH A PARABOLIC VALLEY", 1, ROS, 2);
X(1) := X(2) := 3;
TEST ("ROSENBROCK'S FUNCTION", 3, ROS, 2);
X(1) := X(2) := 8;
TEST ("ROSENBROCK'S FUNCTION", 12, ROS, 2);
X(1) := -1; X(2) := X(3) := 0;
TEST ("HELIX", 1, HELIX, 3);
X(1) := -1.2L; X(2) := -1;
TEST ("CUBE", 1, CUBE, 2);
X(1) := X(2) := 0.1L;
TEST ("BEALE", 1, BEALE, 2);
X(1) := 0; X(2) := 1; X(3) := 2;
TEST ("POWELL", 1, POWELL, 3);
FMIN := 0; X(1) := 0; X(2) := 10; X(3) := 20;
TEST ("BOX", 20, BOX, 3);
X(1) := 3L; X(2) := -1L; X(3) := 0L; X(4) := 1L;
TEST ("POWELL'S FUNCTION WITH A SINGULAR JACOBIAN", 1, SING, 4);
FMIN := 0; X(1) := X(3) := -3; X(2) := X(4) := -1;
TEST ("WOOD", 10, WOOD, 4);

FOR N := 2 STEP 2 UNTIL 8 DO
BEGIN FOR I := 1 UNTIL N DO X(I) := I/(N + 1);
    FMIN := IF N < 8 THEN 0L ELSE 0.0035168737256779L;
    TEST ("CHEBYQUAD", 0.1, CHEBYQUAD, N)
END;
FOR N := 6 STEP 3 UNTIL 9 DO
BEGIN FOR I := 1 UNTIL N DO X(I) := 0;
    FMIN := IF N = 6 THEN 0.00228767005355L ELSE
            IF N = 9 THEN 1.399760138098'-6L ELSE 0L;
    TEST ("WATSON", 1, WATSON, N)
END;

FOR N := 4, 6, 8, 10, 12, 16, 20 DO
BEGIN FOR I := 1 UNTIL N DO X(I) := 0L; FMIN := -N;
    TEST ("TRIDIAG", 2*N, TRIDIAG, N)
END;

FMIN := 0; FOR N := 2 STEP 2 UNTIL 12 DO
BEGIN FOR I := 1 UNTIL N DO X(I) := 1;
    TEST ("HILBERT", 10, HILBERT, N)
END